1  Overview

This  final  report  outlines  the  past  year’s  study  of  an  Optoelectronic  system  for  space 
variant  image  and  signal  processing. 

Space-variant  transforms  (SVT),  such  as  the  Hough  transform1  (HT),  are  useful  in 
many  image  processing  applications  including  radar  detection  and  data  fusion3, 
topographical  map  analysis1  and  autonomous  robot  control5.  However,  the 
implementation of SVT is computation and memory intensive. Therefore, for efficient
processing, all-electronic implementations have been developed using parallel
multiprocessor computer systems, such as pyramid7, mesh8, multi-ring9 and systolic array10 architectures.

For SVT, optical systems that take advantage of the inherent parallelism of optics are an
attractive alternative to such all-electronic systems. Several such systems have been
proposed,  including  the  use  of  cylindrical1  and  micro-lenses12,  rotationally  multiplexed3 
and  computer  generated  holograms14  (CGH).  For  digital  processing,  these  systems  utilize 
spatial  light  modulator  (SLM)  and  detector  (e.g.  CCD)  arrays  for  image  input  and  output 
(I/O),  respectively.  We  have  implemented  such  a  HT  processing  system  using  a  matrix  of 
CGH  15.  Each  hologram  in  the  array  is  designed  to  map  a  specific  image  pixel,  displayed 
on  an  SLM,  to  the  entire  CCD  pixel  array.  The  HT  for  the  entire  image  is  pre-computed 
and  pre-stored  in  the  form  of  a  CGH  array,  which  is  later  accessed  in  parallel  as  an  array 
of  space-variant  impulse  response  holograms.  Therefore,  the  optical  interconnection 
hologram  can  be  seen  as  a  page  oriented  optical  memory.  The  speed  and  efficiency  of  the 
SLM  and  detector  technology  utilized  dictate  the  SVT  image  processing  time. 

The  realization  of  high-resolution  real-time  systems  may  require  optoelectronic  (OE) 
processing systems that utilize fast CMOS driver technology. In such a system, a vertical-cavity
surface-emitting laser (VCSEL) array and a smart-pixel focal plane array can be used for input and output, respectively, at high clock
rates.  In  this  report  we  investigate  the  practical  implementation  of  such  a  system.  In 
Section  2  we  review  space  variant  transforms  and  line  detection  using  the  Hough 
transform.  In  Section  3  we  discuss  a  high-speed  optoelectronic  implementation  of  such 
transforms  and  in  Section  4  we  define  the  performance  metrics  we  use  to  characterize  the 
system.  In  Section  5  we  compare  the  optoelectronic  system  to  all-electronic  parallel 
systems.  In  the  last  section  we  present  a  summary  and  conclusions. 

2  Space  Variant  Transforms

A  SVT  affects  each  input  point  differently,  and  can  be  defined  by  a  2-D  superposition 
integral 


$$F(x,y) = \iint f(\xi,\eta)\, h(x,y;\xi,\eta)\, d\xi\, d\eta, \tag{1}$$

where $f(\xi,\eta)$ is the input image to the SVT filter, $F(x,y)$ is the output image and
$h(x,y;\xi,\eta)$ is the SVT filter impulse response. For SVT, the filter is dependent on the
absolute position in the input plane and varies in the observation output plane. Examples
of SVT used in machine vision and pattern recognition include the log-polar16, reciprocal
wedge17 and HT.

2.1  Straight  Line 

In  the  HT  each  point  from  the  image  (input)  domain  is  mapped  into  a  parametric  curve 
in  the  output  (parameter)  domain.  As  an  example,  we  review  a  HT  employed  for 
detection  of  a  straight  line  in  normal  parameterization.  A  straight  line  in  the  input  domain 
is  described  by 


$$f(x,y) = \begin{cases} 1, & (x,y) \in \rho_0 = x\cos\theta_0 + y\sin\theta_0 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$


where $\rho_0$ is the shortest distance to the origin and $\theta_0$ is the direction of $\rho_0$ (see Figure
1a). Therefore, the parameter domain is the distance-angle plane $(\theta,\rho)$ with Cartesian
coordinate axes $\rho$ and $\theta$ (see Figure 1b). In this case each input plane point $(x_i,y_i)$ will
be mapped into a sinusoidal curve in the $(\theta,\rho)$ domain


$$\rho = x_i\cos\theta + y_i\sin\theta. \tag{3}$$

Using Eq. (3) to design the SVT filter, Eq. (1) can be rewritten to yield the output in the
parameter domain,

$$F(\rho,\theta) = \iint \delta(\rho - x\cos\theta - y\sin\theta)\, f(x,y)\, dx\, dy. \tag{4}$$


Figure  1.  Normal  parameterization  of  a  straight  line  where  the  (a)  input 
image  plane  has  three  points  along  a  line  and  the  (b)  parameter  plane  has 
three corresponding curves intersecting at $(\theta_0,\rho_0)$.


Each  point  in  the  input  plane  generates  a  sinusoidal  curve  in  the  output  domain  that 
spans  all  of  the  lines  that  may  pass  through  that  point.  Sinusoidal  curves  that  intersect  in 
the  parameter  plane  describe  a  line  passing  through  at  least  two  points  in  the  input  plane 
(see Figure 1b). Therefore, multiple points lying along the same line described by
$(\theta_0,\rho_0)$ in the input will result in multiple curves intersecting in the parameter plane at
the point $(\theta_0,\rho_0)$. By taking the sum of the intersecting points, which corresponds to a
measurement  of  intensity  in  an  optical  system,  the  strength  (or  number  of  points)  on  that 
line  is  determined. 
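
A minimal software sketch of this voting procedure may help make the mapping concrete. It is an illustration only, not the optical implementation described in this report: the function names, the number of bins and the `rho_max` range are assumptions chosen for the example. Each input point adds one vote along its sinusoid from Eq. (3), and the bin with the most votes estimates the line parameters.

```python
import math

def hough_lines(points, n_theta=180, rho_max=100.0, n_rho=200):
    """Vote along rho = x*cos(theta) + y*sin(theta) for every input point.

    Bin counts and rho_max are illustrative assumptions, not values from the report.
    """
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta          # theta sampled over [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Map rho in [-rho_max, rho_max] to a bin index.
            r = int(round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)))
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

def strongest_line(acc, rho_max=100.0):
    """Return (theta in degrees, rho, votes) for the fullest accumulator bin."""
    n_theta, n_rho = len(acc), len(acc[0])
    votes, t, r = max((acc[t][r], t, r)
                      for t in range(n_theta) for r in range(n_rho))
    theta_deg = 180.0 * t / n_theta
    rho = r / (n_rho - 1) * 2 * rho_max - rho_max
    return theta_deg, rho, votes
```

For example, five points on the line x + y = 50 should vote into a single bin near an angle of 45 degrees, with the vote count playing the role of the measured intensity.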

2.2  General  Ellipse 

Parametric  curves  of  more  than  two  parameters  will  usually  require  a  higher 
dimensional  parameter  domain.  For  example  a  circle  is  defined  by  three  parameters  (i.e. 
the  coordinate  of  the  center  and  the  radius)  so  that  the  parameter  domain  is  a  3-D  space. 
Moreover,  for  the  detection  of  a  general  ellipse  which  is  a  parametric  curve  described  by 
five parameters (i.e. the coordinates of the center $(x_0,y_0)$, the lengths of the two axes and
the orientation, see Figure 2a), the parameter domain is a 5-D space. In this case, digital
computer implementations of the HT can become even more computing and memory
intensive.


Figure 2. (a) Parameterization of a general ellipse. (b) Simulation of the HT
parameter plane for $\theta = 0^\circ$: $\rho_K = 95$, $\rho_L = 25$; $\theta = 45^\circ$: $\rho_P = 110$, $\rho_Q = 60$;
$\theta = 90^\circ$: $\rho_U = 113$, $\rho_V = 7$. Solving for the parameters using Eqs. (10) to (14)
gives $x_0 = 60$, $y_0 = 60$, $\beta = 30^\circ$, $A_x = 20$ and $A_y = 60$.

We  have  recently  demonstrated  that  it  is  possible  to  evaluate  all  five  parameters  of  the 
general ellipse in a 2-D plane18 using a 2-D filter that implements the HT of a
straight  line  in  normal  parameterization  (Eq.  (3)).  An  ellipse  can  be  described  by  the 
following  equations: 


$$\begin{aligned} x &= x_0 + A_x\cos\beta\cos\alpha - A_y\sin\beta\sin\alpha \\ y &= y_0 + A_x\sin\beta\cos\alpha + A_y\cos\beta\sin\alpha \end{aligned} \tag{5}$$

where $A_x$ and $A_y$ are the two axes and $x_0$ and $y_0$ are the coordinates of the center of the
ellipse, $\alpha$ is a parameter and $\beta$ is the angle between the $A_x$ axis of the ellipse and the $x$
axis. We apply the HT for detection of a straight line in normal parameterization to this
parametric equation by substituting Eq. (5) into Eq. (3) to get

$$\begin{aligned} \rho ={}& (x_0 + A_x\cos\beta\cos\alpha - A_y\sin\beta\sin\alpha)\cos\theta \\ &+ (y_0 + A_x\sin\beta\cos\alpha + A_y\cos\beta\sin\alpha)\sin\theta. \end{aligned} \tag{6}$$

For detection, the amplitude distribution in the parameter domain consists of a
superposition of curves that is described by Eq. (6) and is evaluated at $\alpha$, which varies
from 0 to $2\pi$. Since $\rho$ is a function of both $\theta$ and $\alpha$ for any given curve segment, the
envelopes of the HT in the $(\theta,\rho)$ domain correspond to the extreme values, described by
$\partial\rho/\partial\alpha = 0$ where $\rho$ is given by Eq. (6), that are obtained by changing $\alpha$ at each $\theta$. Evaluating
this derivative and solving the resultant equation with respect to $\alpha$ we obtain:

$$\tan\alpha = \frac{A_y\tan(\theta-\beta)}{A_x}. \tag{7}$$


The equations of the curves that describe the envelopes (upper and lower) of the HT
domain are obtained by substituting Eq. (7) in Eq. (6):

$$\rho_{envel} = x_0\cos\theta + y_0\sin\theta \pm \left[A_x^2\cos^2(\theta-\beta) + A_y^2\sin^2(\theta-\beta)\right]^{1/2}. \tag{8}$$

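To solve for the 5 unknown parameters we evaluate Eq. (8) at three independent values of
$\theta_i$ (where $i = 1,2,3$) to obtain 6 independent equations. Here we choose
$\theta_i = 0^\circ, 45^\circ, 90^\circ$ with the six corresponding points on the envelope K, L, P, Q, U, V:

$$\begin{aligned}
\theta_{0^\circ}:&\quad \rho_{K,L} = x_0 \pm \left[A_x^2\cos^2\beta + A_y^2\sin^2\beta\right]^{1/2} \\
\theta_{45^\circ}:&\quad \rho_{P,Q} = \frac{\sqrt{2}\,(x_0+y_0)}{2} \pm \left[\frac{A_x^2+A_y^2}{2} + \frac{(A_x^2-A_y^2)\sin 2\beta}{2}\right]^{1/2} \\
\theta_{90^\circ}:&\quad \rho_{U,V} = y_0 \pm \left[A_x^2\sin^2\beta + A_y^2\cos^2\beta\right]^{1/2}
\end{aligned} \tag{9}$$

where $\rho_K$ is the distance to the top and $\rho_L$ to the bottom of the superposition envelope
when $\theta = 0^\circ$ (see Figure 2b). Solving for the parameters we get five equations: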
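
$$x_0 = \frac{\rho_K + \rho_L}{2} \tag{10}$$

$$y_0 = \frac{\rho_U + \rho_V}{2} \tag{11}$$

$$\beta = \frac{1}{2}\tan^{-1}\left[\frac{2(\rho_P-\rho_Q)^2 - (\rho_K-\rho_L)^2 - (\rho_U-\rho_V)^2}{(\rho_K-\rho_L)^2 - (\rho_U-\rho_V)^2}\right] \tag{12}$$

$$A_x^2 = \frac{(\rho_K-\rho_L)^2\cos^2\beta - (\rho_U-\rho_V)^2\sin^2\beta}{4\cos 2\beta} \tag{13}$$

$$A_y^2 = \frac{(\rho_U-\rho_V)^2\cos^2\beta - (\rho_K-\rho_L)^2\sin^2\beta}{4\cos 2\beta} \tag{14}$$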


One  of  the  advantages  of  this  parameterization  is  that  for  the  detection  of  a  general  ellipse 
only  three  angles  need  to  be  sampled  in  the  parameter  plane.  This  parameterization  and 
the corresponding detection is further simplified when $A_x = A_y$, i.e. the ellipse becomes a
circle, and when one of the axes goes to zero, e.g. $A_x = 0$, i.e. the ellipse becomes a line
segment.
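
As a numerical check on this parameterization, the short Python sketch below recovers the five ellipse parameters from six envelope readings. It assumes the relations $x_0 = (\rho_K+\rho_L)/2$, $y_0 = (\rho_U+\rho_V)/2$, $\tan 2\beta = [2(\rho_P-\rho_Q)^2 - (\rho_K-\rho_L)^2 - (\rho_U-\rho_V)^2]/[(\rho_K-\rho_L)^2 - (\rho_U-\rho_V)^2]$ and the two axis formulas divided by $4\cos 2\beta$; the function and variable names are hypothetical, and the principal value of the arctangent is used.

```python
import math

def ellipse_from_envelope(rho_k, rho_l, rho_p, rho_q, rho_u, rho_v):
    """Solve for (x0, y0, beta in degrees, Ax, Ay) from envelope points.

    K, L are read at theta = 0, P, Q at 45 degrees and U, V at 90 degrees.
    Not valid when (rho_k - rho_l) == (rho_u - rho_v), where the ratio
    below is undefined (e.g. a circle).
    """
    x0 = (rho_k + rho_l) / 2.0
    y0 = (rho_u + rho_v) / 2.0
    dk = (rho_k - rho_l) ** 2
    dp = (rho_p - rho_q) ** 2
    du = (rho_u - rho_v) ** 2
    beta = 0.5 * math.atan((2.0 * dp - dk - du) / (dk - du))
    c2 = math.cos(beta) ** 2
    s2 = math.sin(beta) ** 2
    denom = 4.0 * math.cos(2.0 * beta)
    ax = math.sqrt((dk * c2 - du * s2) / denom)
    ay = math.sqrt((du * c2 - dk * s2) / denom)
    return x0, y0, math.degrees(beta), ax, ay
```

Called with the readings quoted in the Figure 2 caption (95, 25, 110, 60, 113, 7), the sketch returns values close to the quoted center (60, 60), orientation of about 30 degrees and axes of about 20 and 60; the small differences come from the rounded envelope readings.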


Parameterizations have also been developed that allow for the 2-D detection of a
variety of geometric shapes, e.g. hyperbolic curves19, as well as the 2-D iterative
detection of 3-D geometric shapes20. The Generalized Hough Transform (GHT) has been
developed for the detection of shapes that cannot be described analytically21. The GHT
has also been developed for recognition of 3-D objects22.


3  Optoelectronic  System  Implementation 

We have demonstrated an OE processing system using a 256×256 matrix of CGH
filters for Hough transform, coordinate transform, and optical interconnects15. To
construct the matrix of holograms, the CGH encoding of the SVT for each pixel position
was displayed on a Hughes liquid crystal light valve (LCLV) and optically recorded, in
the Fourier transform plane, onto silver halide holographic film. The image processing
system, shown in Figure 3, used the LCLV to generate the input image plane, which is
imaged onto the matrix of holograms. The output is reconstructed onto the CCD camera
using a Fourier lens.


Figure 3. Optoelectronic system for space-variant image processing. A
CRT and laser are used to generate the real-time image from the LCLV.
The outputs from the holograms are reconstructed off-axis onto the CCD
camera using a Fourier lens.

We have also demonstrated an OE HT image processing system, for detection of
straight lines and ellipses, using a 64×64 matrix of computer generated holograms
constructed using standard micro-lithography and fabrication techniques23. To efficiently
encode each CGH for the Fourier transform of the desired filter, we used a modified
version of the direct binary search (DBS) method combined with the iterative Fourier
transform algorithm24. The four-phase level CGH was fabricated using a combination of
e-beam and photo-lithography and chemically assisted ion beam etching. Each hologram
contains 128×128 pixels, which are each 5 μm × 5 μm (see Figure 4). The micro-fabricated
CGH have the advantage of uniform holographic recording of each pixel as well as
flexibility of CGH design, e.g. allowing for on-axis reconstruction.

Figure 4. SEM micrograph of a portion of a CGH shows the 4-phase level
structure  fabricated  on  a  quartz  substrate. 

Figure 5a-d show some example experimental results for our CGH based HT image-
processing system. Figure 5a shows the video image of multiple 45° lines and Figure 5b
shows the CCD image of the output plane of the CGH HT filter. The intensity maxima of
the superposition of the multiple HT curves all lie on the $\theta = 45^\circ$ line and at the various
values of $\rho$ coinciding with the normal parameterization of each line. Figure 5c shows the
video input image of an ellipse centered at the origin and Figure 5d shows the CCD
image of the parameter plane. Notice the superposition envelope is symmetric about the
$\theta$-axis, which correctly corresponds to $x_0 = 0$ and $y_0 = 0$. The presence of a zero-order
component in the center of the output images (bright spot) indicates imperfect reconstruction of
the HT holograms due to CGH fabrication errors.

The next generation of optoelectronic systems will require fast I/O to perform real-time
processing of high-resolution images on the order of 1024×1024 input pixels. Using
VCSEL and smart focal plane arrays flip-chip bonded to CMOS drivers allows operation
at low power with I/O clock rates greater than 150 MHz25. The SVT of the input image to
the output image is performed using diffractive and micro-optics. By using the same
element spacing, the VCSEL and hologram arrays can be placed directly next to each
other (see Figure 6). The refractive lens may also be replaced with an array of
micro-lenses. Therefore, the components and subsystems can be packaged into a compact and
rigid optoelectronic system. Efficient packaging will also be able to take advantage of
in-situ recording and self-assembly techniques.


Figure 5. (a) Video input image shows multiple 45° lines and (b) output plane
CCD image shows superposition of HT curves intersecting at $\theta = 45^\circ$. (c) Video
image of ellipse centered at origin and (d) CCD output image.

The  VCSEL  array  is  expected  to  act  as  a  partially  spatially  coherent  quasi- 
monochromatic  source.  A  unique  feature  of  using  these  VCSEL  arrays,  as  opposed  to 
traditionally  employed  SLM  implementations,  is  improvement  of  the  SNR  performance 
due  to  incoherent  superposition  at  the  output  (e.g.  reduced  speckle  noise).  Tunability  of 
VCSEL  arrays  would  also  enable  using  wavelength  multiplexing  of  volume  holography 
to  pre-store  multiple  transforms,  thereby  allowing  reconfiguration  for  efficiently 
implementing  a  number  of  SVT  algorithms.  Since  such  systems  are  generally  used  for 
line  and  shape  detection  (where  missing  pixels  have  a  small  effect  on  detection 
efficiency), relatively low yield VCSEL arrays can be tolerated, allowing for the use of
large-scale arrays.


Figure 6. Optoelectronic SVT image processing system uses VCSEL and
smart-pixel arrays bonded to CMOS drivers for fast I/O. The
reconstruction lens can be replaced by diffractive micro-lenses, allowing
for a more integrated and compact package.


4  Performance  Metrics 


Figure 7. Dimensions of the image, transform and parameter domains. The
dimension of the input image $N$ can be greater than the VCSEL array dimension
$F$. Likewise the output domain can be larger than the detector array, requiring
some form of block-serial mapping.


In order to analyze the capabilities of such an OE processing system we define a set of
performance metrics that can be compared to other existing SVT image processing
systems. We begin by defining the dimensions of our system. Let the image (input)
domain be of size $N \times N$, the transform domain be of size $K \times K$ and the parameter (output)
domain be of size $M \times M$ (see Figure 7), where for simplicity we have assumed
symmetric arrays. The VCSEL (SLM) array is of size $F \times F$ (i.e. the input image can
have a different dimension than the VCSEL array) and the smart-pixel (detector) array is
$D \times D$; then for a fully space variant transform $F = K$ and $F < D$.

4.1  Space  Bandwidth  Product 

The dimensions $F$ and $K$, the input image and output planes, will be constrained by the
attainable pixel size of the CGH elements. Assuming the output from each VCSEL fills
the area of its corresponding CGH, then the 2-D space bandwidth product (SBWP) of a
single hologram can be given by $\mathrm{SBWP}_{single} = D^2C^2$ where $C$ is the spatial encoding (e.g.
for a 64×64 image, a single hologram with 128×128 pixels results in a spatial encoding
$C = 2$). Therefore, for the entire matrix of holograms we obtain

$$\mathrm{SBWP} = K^2D^2C^2. \tag{15}$$

We can also define the SBWP in terms of the physical dimensions of the matrix of
holograms so that

$$\mathrm{SBWP} = \left(\frac{l_h}{\delta}\right)^2, \tag{16}$$

where $l_h$ is the length of the hologram matrix and $\delta$ is the size of each pixel in the CGH
(minimum feature size). If we assume a SVT where $F = D$, then equating Eqs. (15) and (16) with $F = K = D$ we can say

$$F = \sqrt{\frac{l_h}{C\delta}}. \tag{17}$$

For example, if we assume that the matrix of holograms has a length $l_h = 100$ mm, a
minimum feature size $\delta = 1$ μm and encoding $C = 6$, then $K, D \approx 128$ (i.e. the
input/output can be of maximum dimension 128×128).

4.2  Footprint  Area  and  Volume 

The  length  of  the  system  will  depend  on  the  size  of  each  component  as  well  as  the 
distance between components. Assume each VCSEL has an aperture $a$ and a pitch $p$, then
the distance between the VCSEL array and holograms can be approximated by

$$z_{cgh} = \frac{pa}{2\lambda} \tag{18}$$

(assuming the VCSEL array pitch is equal to the pitch of the hologram array).
The distance from the reconstructing lenses to the output plane is given by

$$z_o = f = d\,F/\#, \tag{19}$$

where $f$ is the focal length (if we assume a refractive lens, then the diameter $d > \sqrt{2}\,l_h$).

Since each pixel is Fourier transformed onto the output plane, we can describe the size of
the detector array by the relation $l_d = \lambda f/\delta$. The size of each detector in the smart-pixel
array is $\delta_d = l_d/D$ and the active detection area is restricted to $< \delta_d$ (we assume a densely
packed detector array with processing performed outside the detection area).

The product of its overall length and width gives the footprint of the system

$$S_F = z \cdot l = (t_v + z_{cgh} + t_h + t_R + z_o + t_d)(l_{max}), \tag{20}$$

where $t_v$, $t_h$, $t_d$ and $t_R$ are the thicknesses of the VCSEL, hologram and smart-pixel
detector arrays and the refractive lens, respectively. Substituting from the relations above
we get

$$S_F = \left(t_v + \frac{pa}{2\lambda} + t_h + t_R + \sqrt{2}\,l_h F/\# + t_d\right) l_h, \tag{21}$$

where we assume the maximum width to be of order $l_h$. If we continue to assume a
symmetric system, then the volume of the system is defined by

$$S_V = \left(t_v + \frac{pa}{2\lambda} + t_h + t_R + \sqrt{2}\,l_h F/\# + t_d\right) l_h^2. \tag{22}$$


Let's assume the VCSEL and smart-pixel arrays have a thickness of 0.5 cm (including the
CMOS layer), the holographic array has a thickness of 0.1 cm and the refractive lens has a
thickness of 1.0 cm. Also, we assume F/1, $\lambda = 0.980$ μm, $\delta = 1$ μm, $K, D = 128$ and
$C = 6$. The hologram width will be $l_h = 10$ cm, which gives a focal length $f = 14$ cm. If
the VCSEL diameter is $a = 3$ μm, then the distance between the VCSEL and hologram
array is $z_{cgh} = 1.1$ mm. The total length of the system is $z = 16$ cm, which results in a
footprint of $S_F \approx 160$ cm² and a volume of $S_V \approx 1600$ cm³. From Eq. (21) we see that the
system dimensions primarily depend on the size of each pixel, dictated by the minimum
attainable feature size of the diffractive optical element (DOE), along with the F/# of the
reconstruction lens. For example, reducing the pixel size to $\delta = 0.5$ μm will reduce the
total length to $z = 9$ cm, the footprint to $S_F \approx 45$ cm² and the volume to $S_V \approx 225$ cm³.
On the other hand, using a F/2 lens will almost double the system dimensions. An
attractive alternative to the thick refractive lens is to use a relatively thin diffractive lens
(i.e. CGH) to perform the Fourier transform from the HT plane to the output plane.
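
The arithmetic behind these estimates can be sketched in a few lines of Python. The sketch is illustrative and rests on assumptions that are not stated explicitly above: the hologram matrix length is taken as $l_h = K^2 C \delta$ (which reproduces the quoted 10 cm), the pitch is taken as $p = l_h/K$, and the lens diameter is set to its limiting value $\sqrt{2}\,l_h$. The function name and argument names are hypothetical.

```python
import math

def system_dimensions(K=128, C=6, delta_um=1.0, wavelength_um=0.98,
                      aperture_um=3.0, f_number=1.0,
                      t_v=0.5, t_h=0.1, t_r=1.0, t_d=0.5):
    """Return (l_h, f, z_cgh, z, S_F, S_V) in cm, cm^2 and cm^3.

    Thicknesses t_* are in cm; feature size, wavelength and aperture in um.
    """
    l_h = K * K * C * delta_um * 1e-4        # assumed l_h = K^2 * C * delta (um -> cm)
    pitch_um = l_h * 1e4 / K                 # assumed pitch p = l_h / K
    z_cgh = pitch_um * aperture_um / (2.0 * wavelength_um) * 1e-4  # p*a/(2*lambda), in cm
    f = math.sqrt(2.0) * l_h * f_number      # focal length with d = sqrt(2) * l_h
    z = t_v + z_cgh + t_h + t_r + f + t_d    # total system length
    s_f = z * l_h                            # footprint
    s_v = s_f * l_h                          # volume of the symmetric system
    return l_h, f, z_cgh, z, s_f, s_v
```

With the default arguments the sketch gives a length near 16 cm, a footprint near 160 cm² and a volume near 1600 cm³; calling it with `delta_um=0.5` gives roughly 9 cm, 45 cm² and 220 cm³, in line with the figures quoted above.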

4.3  Power  Dissipation 

To analyze the power requirements of the system we begin with the smart-pixel
detectors. We can approximate the power requirements of each pixel by the relation

$$P_{det} = \frac{1}{R}\left(C_{det}\frac{dV}{dt} + I_{leak}\right), \tag{23}$$

where $R$ is the responsivity. If we assume a leakage current $I_{leak} = 100$ pA, detector
capacitance $C_{det} = 30$ fF and responsivity $R = 0.3$ A/W, then to get 10 mV at a 500 MHz
clock rate should require an incident optical power of $P_{det} = 0.5$ μW. Dines26 has shown
that smart pixels operating at greater than 100 MHz clock rate require a detector area of
only 50 μm × 20 μm and less than 10 fJ of incident energy. At a 500 MHz clock rate this
corresponds to $P_{det} \approx 5$ μW. Since each VCSEL needs to power a minimum of 128
smart-pixels in the detector array, then assuming 50% efficiency of the DOE, the required
optical power from the VCSEL is between $P_v = 0.1 - 1.0$ mW. For low threshold VCSELs
we can assume27 a slope efficiency of 55% and a threshold current $I_T = 212$ μA. Using
the conservative upper-bound estimate $P_v = 1$ mW, the required current per VCSEL
is $I = 760$ μA. Using 3 V CMOS driving circuitry would require 2.3 mW of driving
power. Since only one row is addressed at any one time, then for $N = 128$ where every
pixel is in the 'on' state this corresponds to $P < 0.3$ W.
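
The power budget can be reproduced with the following Python sketch. It assumes the detector relation $P_{det} = (C_{det}\,dV/dt + I_{leak})/R$ and approximates $dV/dt$ by the voltage swing multiplied by the clock rate, which is an assumption of this example; the function names are hypothetical, and the default values are the ones quoted in the text.

```python
def detector_power(c_det=30e-15, dv=10e-3, clock_hz=500e6,
                   i_leak=100e-12, responsivity=0.3):
    """Incident optical power per detector in watts.

    dV/dt is approximated as dv * clock_hz (assumption of this sketch).
    """
    return (c_det * dv * clock_hz + i_leak) / responsivity

def vcsel_optical_power(p_det, fanout=128, doe_efficiency=0.5):
    """Optical power one VCSEL must emit to feed `fanout` detectors."""
    return fanout * p_det / doe_efficiency

def row_drive_power(n=128, supply_v=3.0, current_a=760e-6):
    """Electrical drive power for one fully 'on' row of n VCSELs."""
    return n * supply_v * current_a
```

Under these assumptions `detector_power()` evaluates to about 0.5 μW, `vcsel_optical_power` spans roughly 0.13 mW to 1.3 mW for detector powers of 0.5 μW and 5 μW, and `row_drive_power()` evaluates to about 0.29 W.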

To determine the thermal performance of such a system we assume that the 0.3 W of
power will need to be dissipated on the end surfaces. For $l_h = 10$ cm, the surface area of
each end is $S_E = 100$ cm². Therefore the VCSEL array will need to dissipate
approximately 3 mW/cm². We can assume that the smart-pixel array will need to dissipate the
same order of power28 over a similar surface area (on the opposite side of the OE system).


4.4  Processing  Time 


The performance metric that can most easily be compared to alternative systems is the
amount of time required to SVT process a single input image. For a CMOS driver
operating at clock rate $f$, we can assume that the VCSEL response time is on the order of
$\tau = 1/f$. Using parallel-matrix addressing, i.e. where only one pixel per row can be
addressed at a time, the load time of an input image is $t_k = K/f$. The smart-pixel
detection time can be described by $t_d = (D + P)/f$, where $D$ is the number of rows in the
detector and $P$ is the number of clock cycles required to perform some local processing
and A-D conversion. The total processing time for $N = K$ is then given by the sum

$$T = \frac{K + D + P}{f}. \tag{24}$$


For a system where the dimension of the input domain is greater than the filter domain,
$N > K$ (e.g. a large input image or wavelength multiplexed processing), the system
must perform some block-serial mapping and can be described by

$$T = \left(\frac{N}{K}\right)^2 \frac{K + D + P}{f}. \tag{25}$$

Similarly, for a parameter domain larger than the filter domain, $M > D$ (e.g. for an
asymmetric SVT), the total processing time can be given by

$$T = \frac{K + D + P}{f}\left(\frac{NM}{KD}\right)^2. \tag{26}$$


For large image sizes we can assume that the number of clock cycles required for local
processing is small compared to the image size, i.e. $P \ll K, D$. There are three cases that
we will examine:

1. The input and output dimensions are equal to (or smaller than) the filter domain, i.e.
$N = K = D = M$, then the total processing time reduces to

$$T_1 = \frac{2N}{f}. \tag{27}$$

2. The input dimension is larger than the filter dimension, i.e. $N > K$, then

$$T_2 = \left(\frac{N}{K}\right)^2 \frac{2K}{f}. \tag{28}$$

3. The input and output dimensions are larger than the filter dimension, i.e. $N > K$ and
$M > D$. However, we will assume that the input and output are of the same
dimension, i.e. $N = M$ and $K = D$, then

$$T_3 = \left(\frac{N}{K}\right)^4 \frac{2K}{f}. \tag{29}$$


In Figure 8 we have plotted the processing time for each of the three cases discussed and
we see that there is a steep (power-law) increase in the processing time when the input/output
sizes are larger than the holographic filter. For example, assume an input image of
512×512 and CMOS clock rate of 500 MHz. If the matrix of holograms is also 512×512
then the total processing time will be $T_1 = 2.0$ μs. However, for a SVT filter of size
128×128, the processing time is increased to $T_3 = 130$ μs (where the output is of
dimension 512×512).
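
The three timing cases can be checked with a small Python sketch. It assumes the block-serial model in which an $N \times N$ input and $M \times M$ output are handled in $(N/K)^2 (M/D)^2$ passes of $(K + D + P)/f$ each; rounding partial blocks up to whole blocks, and using only as many rows as the image needs when it is smaller than the filter, are assumptions of this example, as is the function name.

```python
import math

def processing_time(n, m, k, d, f_hz, p=0):
    """Seconds to SVT-process an n x n image into an m x m output.

    k is the filter (VCSEL) dimension, d the detector dimension and
    p the clock cycles for local processing and A-D conversion.
    """
    rows_in = min(n, k)                       # rows loaded per pass
    rows_out = min(m, d)                      # rows read out per pass
    passes = (math.ceil(n / k) ** 2) * (math.ceil(m / d) ** 2)
    return passes * (rows_in + rows_out + p) / f_hz
```

For a 500 MHz clock, `processing_time(512, 512, 512, 512, 500e6)` evaluates to about 2.0 μs and `processing_time(512, 512, 128, 128, 500e6)` to about 131 μs, matching the two examples above.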


Figure 8. Log of computing time in units of $K/f$ as a function of $N/K$.

4.5  Throughput

Another convenient measure of performance is throughput, or the number of binary
operations per second. Each bit (pixel) in the input plane is SVT mapped onto every pixel
in the output plane, so the number of operations per second can be described by

$$\Theta = \frac{N^2M^2}{T} = \frac{f}{K + D + P}\left(\frac{KD}{NM}\right)^2 N^2M^2. \tag{30}$$

If we assume that $K = D$ and $P \ll K, D$, then this simplifies to

$$\Theta = \frac{K^3 f}{2}. \tag{31}$$

For a filter of size 128×128 and a 500 MHz clock rate the number of binary
operations is $\Theta = 5.2\times10^{14}$ bits/sec. This large bandwidth is representative of the
parallelism of an optics-based processing system.
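
The quoted throughput follows from a one-line calculation, sketched here in Python. The sketch assumes the throughput is the total number of input-output pixel mappings, $N^2M^2$, divided by the block-serial processing time, which leaves $K^2D^2 f/(K + D + P)$; the function name is hypothetical.

```python
def throughput(k, d, f_hz, p=0):
    """Binary operations per second, K^2 * D^2 * f / (K + D + P)."""
    return (k * d) ** 2 * f_hz / (k + d + p)
```

For `throughput(128, 128, 500e6)` this evaluates to about 5.2e14 operations per second, the same as $K^3 f/2$ for $K = D$ and negligible $P$.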


5  Comparison 

For  comparison  of  our  proposed  OE  SVT  image  processing  system  to  all-electronic 
systems,  we  will  assume  a  relatively  conservative  OE  system  configuration  that  has  a 
128x128  VCSEL  array  for  input,  a  128x128  array  of  holograms  for  SVT  filtering,  a 
128x128  smart-pixel  detector  array  and  CMOS  circuitry  operating  at  a  500  MHz  clock 
rate. For such a system a 512×512 image will be SVT processed in approximately 130 μs
and a 1024×1024 input image will be processed in 1.0 ms. A proposed OE system using
ferroelectric liquid crystal1 (FLC) input devices and CGH encoded HT filters is capable
of processing a 512×512 image in 1 ms, limited by the speed of the SLM.

The all-electronic SYMPATI2, developed in 1992 by CEA/LETI/DEIN, uses
the single instruction multiple data (SIMD) architecture with 128 parallel processing
elements (PE) and is reported to perform the HT of a 512×512 input image (for 512
angles) in approximately 1 second. CEA has also developed the second-generation SIMD
device, SYMPHONIE. Using 256 PE it is unofficially reported to be able to HT a
256×256 input image (for 256 angles) in 30 ms. There is also a 1024 PE SYMPHONIE;
however, no performance data is available on this device.

More  recently,  a  HT  board,  utilizing  a  reconfigurable  parallel  architecture  and  two 
content  addressable  memory  (CAM)  chips,  reported50  line  extraction  of  a  256x256  input 
image  in  76  ms.  Using  a  reconfigurable  multi-ring  network  (RMRN)  architecture  where 
each  node  is  a  T9000  transputer,  the  HT  operation  on  a  1024x1024  input  image  is 
estimated9  at  14  seconds,  using  128  nodes.  Performing  the  HT  for  straight-line  detection 
in  a  512x512  input  image  (with  2048  angles)  using  the  Systola  1024,  a  systolic  array  of 
1024  processors  implemented  on  a  standard  PCI  board,  took10  0.25  seconds.  This 
corresponds  to  a  processing  time  of  62.5  ms  for  512  angles. 

An alternative to these electronic DSP methods for HT processing is a strictly memory
lookup method. To analyze the performance for a HT processing system that uses a
lookup table, i.e. storing the HT $(\theta,\rho)$ results for all $(x,y)$ input points, we begin by
defining the dimensions of the storage required. Assuming a binary $N \times N$ input image,
the input is $N^2$ bits. Each input point maps a sinusoidal curve to the output, i.e. $N$
angles $\theta$. For each $\theta$ we need a word of length $\log_2 N$ to distinguish between $N$ possible
$\rho$. Therefore, the total storage required is

$$S = N^3 \log_2 N \ \text{bits}. \tag{32}$$

Since each angle is independent, we divide the transforms by angle into multiple memory
chips ($R$) (i.e. each chip determines $N/R$ angles), giving the storage for each chip

$$\frac{S}{R} = \frac{N^3 \log_2 N}{R} \ \text{bits}. \tag{33}$$

For $N = 1024$, Eq. (32) gives a total storage of $S \approx 1.1\times10^{10}$ bits.
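
The storage estimate is easy to evaluate for other image sizes. The Python sketch below assumes the total storage is $N^3 \log_2 N$ bits, split evenly over $R$ chips; the function name is hypothetical.

```python
import math

def lookup_storage_bits(n, chips=1):
    """Return (total bits, bits per chip) for an n x n lookup-table HT."""
    total = n ** 3 * math.log2(n)   # n^2 points, n angles, log2(n)-bit words
    return total, total / chips
```

For $N = 1024$ the total is about 1.07e10 bits; split over 16 chips, as an illustrative choice, this is about 6.7e8 bits per chip.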

For a lookup method, the HT can be described by an algorithm consisting of the
following steps:

1. Read input $(x,y)$ value, assume 1 clock cycle.

2. Read HT $(\theta,\rho)$ value, 1 clock cycle.

3. Read existing output array value $(\theta_{old},\rho_{old})$, 1 clock cycle.

4. Increment output array value $(\theta_{old},\rho_{old})$, assume 2 clock cycles.

5. Write new output array value $(\theta_{new},\rho_{new})$, 1 clock cycle.

In the steps described above we assume that reading the input and memory resident
values of the HT can be done in one clock cycle. Steps 3-5 can be described together as
writing the new value to the correct position in the output array (see Figure 9). In total,
we assume only six clock cycles at every angle for each input point.


Figure  9.  Increment  of  output  array  for  each  angle.  The  initial  or  existing  array 
value  is  read,  incremented  and  the  new  value  written  to  the  output  array. 
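
The five steps above can be sketched in Python as follows. This is a minimal illustration
and not the memory-chip implementation analyzed in the report: the centred coordinate
system, the quantisation of ρ into N bins, and the names `build_lookup` and
`hough_by_lookup` are all assumptions made for the example.

```python
import math


def build_lookup(n: int):
    """Precompute the rho bin for every (x, y) and every theta index of an n x n image."""
    half = n / 2.0
    rho_max = math.sqrt(2.0) * half  # largest |rho| for coordinates centred on the image
    table = {}
    for y in range(n):
        for x in range(n):
            row = []
            for t in range(n):
                theta = math.pi * t / n
                rho = (x - half) * math.cos(theta) + (y - half) * math.sin(theta)
                # Quantise rho into one of n bins (assumed binning, nearest bin).
                row.append(int((rho + rho_max) / (2.0 * rho_max) * (n - 1) + 0.5))
            table[(x, y)] = row
    return table


def hough_by_lookup(image, table):
    """Accumulate votes using only table reads and increments."""
    n = len(image)
    acc = [[0] * n for _ in range(n)]  # acc[theta_index][rho_index]
    for y in range(n):
        for x in range(n):
            if not image[y][x]:  # step 1: read the input value
                continue
            for t, r in enumerate(table[(x, y)]):  # step 2: read the stored (theta, rho)
                acc[t][r] += 1  # steps 3-5: read, increment, write back
    return acc
```

Each set input pixel contributes exactly one vote per angle, so the inner loop body runs N
times per pixel, which is where the per-angle clock-cycle count enters the timing estimate.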
Therefore, we can estimate the total time required to process an NxN input image to
be

$$T = \frac{6N^{3}}{Rf}, \qquad (34)$$

where f is the clock rate of the memory chips and we assume that R<N. For
comparisons to alternative systems we will assume our memory to be Rambus RDRAM
chips, capable of operating on a word (~log2 N bits) in parallel at an average rate of
1 gigaword per second. Therefore, for N = 512, and assuming R = 16, the total processing
time is T = 0.05 seconds and for N = 1024 the processing time is T = 0.4 seconds. To
compare this lookup method with our proposed OE system we plot the processing times,
using Eqs. (27), (28) and (34), in Figure 10. Comparing ideal systems, where N = K, R,
the OE system increases its advantage in processing speed as the dimensions of the input
image increase. However, for an OE system with a small number of holograms, the
exponential increase in processing time decreases its advantage over the strictly memory
lookup table system with a limited number of memory chips.
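
The quoted lookup-table processing times can be checked with a few lines of Python. The
sketch assumes the cost model stated in the text (six clock cycles per angle for each of the
N² input points, with the N angles shared among R chips), which gives T = 6N³/(Rf); the
function name and default arguments are hypothetical.

```python
def lookup_time_seconds(n: int, r: int = 16, f_hz: float = 1e9, cycles: int = 6) -> float:
    """Processing time for an N x N image: N^2 points, N/R angles per chip, 6 cycles each."""
    return cycles * n ** 3 / (r * f_hz)


if __name__ == "__main__":
    for n in (512, 1024):
        print(f"N = {n}: T = {lookup_time_seconds(n):.3f} s")
```

With R = 16 and a rate of 1 gigaword per second this gives roughly 0.05 s for N = 512 and
0.4 s for N = 1024, in line with the values quoted above.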
Figure 10. Log plot shows processing times for RDRAM memory resident
and OE processing systems as functions of input image dimension.


Table  1.  Processing  times,  given  in  seconds,  for  various  SVT  image- 
processing  systems.  For  comparison  we  computed  a  serial  implementation 
of  the  HT  on  a  SunSPARC20  workstation  with  a  100  MHz  processor. 

In Table 1 we compare the processing times for the various systems discussed above.
The all-electronic systems can perform the HT on the order of milliseconds for images of
size up to 512x512 and on the order of seconds for larger input images. On the other
hand, our proposed OE system can perform the HT on images of dimension 1024x1024
in a few milliseconds, as required for real-time processing.

6  Conclusions 

The  research  results  of  this  study  can  be  summarized  as  follows: 

1.  An optoelectronic SVT image processing system, using CMOS driven VCSEL and
smart-pixels for I/O and filtering performed by a matrix of computer generated
holograms, has been presented.

•  The incoherence between pixels of a VCSEL based SLM can improve the SNR of
the OE processing system compared to a coherent SLM.

2.  We  have  identified  three  SVT  considered  highly  useful  for  image  and  signal 
processing:  the  Log-Polar,  Reciprocal  Wedge  and  Hough  Transforms. 

3.  In  order  to  analyze  the  strengths  and  weaknesses  of  our  proposed  OE  system  we 
defined  a  set  of  performance  metrics.  From  this  analysis  we  find: 

•  The dimensions of the OE system are primarily constrained by the minimum
feature sizes attainable using micro-fabrication techniques. However, using
relatively conservative estimates, our proposed system is highly compact
(S_F ~ 45 cm², S_V ~ 225 cm³) and can easily fit onto a portion of a PC add-on
board.

•  Since  only  one  row  is  addressed  at  any  one  time,  the  power  consumption  of  a 
128x128  VCSEL  array  is  a  relatively  low  value  of  0.3  W. 

•  Using currently available VCSEL, smart-detector and CMOS technology our
proposed OE processing system can perform HT on a 128x128 image in 0.51 µs.
For larger images block-serial mapping can be used. Therefore, for a 1024x1024
input image a HT would take only 2.1 ms.
4.  The exponential growth of the processing times for an OE system using relatively
small  arrays  is  the  greatest  limitation  to  SVT  processing  of  very-large  and  high- 
resolution  images. 

•  However,  since  these  systems  can  tolerate  low  yield  VCSEL  arrays,  or  even 
stitching  of  multiple  small  arrays,  it  may  be  feasible  to  have  VCSEL  or  detector 
arrays  on  the  order  of  1024x1024. 

•  For an input image size of 1024x1024 into a SVT on an OE processing system
with VCSEL, hologram and smart-pixel arrays all of dimension 1024x1024
(stitched or low-yield), the processing time is estimated at only 4.1 µs.
ABSTRACT

An optical implementation of the Hough transform (HT) based on a matrix of 64x64 four-level phase computer generated
holograms (CGH) is described. The HT holograms are designed using a novel algorithm that combines the high speed of
convergence of the iterative Fourier transform algorithm (IFTA) with the high precision of the direct binary search (DBS)
algorithm. The matrix of holograms was fabricated using standard microfabrication techniques: E-beam lithography,
photolithography and chemically assisted ion beam etching. The fabricated elements were characterized experimentally and
used in the HT processor, demonstrating practical application examples such as real-time straight line, ellipse and circle
detection, and other pattern recognition tasks.

Keywords:  Hough  transform,  pattern  recognition,  computer  generated  holograms 

1. INTRODUCTION

Hough transform (HT) techniques are widely used in pattern recognition tasks for detecting parametrized templates such
as lines, line segments, circles and ellipses in binary edge images. Surveys of HT applications and realizations are
presented by Illingworth et al.1 and by Leavers2. The HT maps the image space into a parameter space such that a peak
occurring in the output parameter space indicates the possible occurrence of a specific sought pattern in the input picture. The
advantage of the HT over classical correlation techniques is its stability against noise and occlusion arising in
digitized  images.  Although  invented  in  19625,  the  HT  method  is  still  an  active  topic  of  research  in  the  scientific  community 
with more than 50 papers published in 1996. They concern military applications (such as target tracking4, robot vision and
autonomous vehicle navigation5-6, etc.) as well as civil applications (such as medical imaging7-8, geological analysis of
grounds9, character recognition tasks10, supervision of the quality in industrial production11, etc.).

Most of these publications aim at reducing the time and memory requirements of purely electronic implementations of the
HT algorithm. Special mathematical techniques that reduce computational complexity have been developed to increase the
speed of realizing the HT algorithm electronically (e.g., direction and amplitude gradient operators, contour following, etc.).
However, such mathematical techniques may cause a loss of accuracy in the results and lead to a coarser resolution. Some
electronic implementations use specialized chips or specialized computers, leading to high-performance HT12. However,
these processors not only remain expensive, but they are also slow when applied to large images, thereby limiting various
real-time applications. For example, implementation of the HT for an image of size NxN using a serial electronic computer
will require thresholding a composite image resulting from a superposition of NxN images, each of size NxN, thereby
requiring O(N^4) operations and O(N^4) storage. A more expensive but better performing solution uses a parallel computer.
A parallel electronic computer implementation of the HT on a single instruction multiple data (SIMD) machine with a single
processing element (PE) for each input image pixel will speed up the HT calculation by a factor of O(N^2). However, it is
not realistic to assume an array of PEs equal in number to the pixels of a high-resolution image. For example, the SIMD
machine SYMPATI2 designed by CEA/LETI/DEIN has an SIMD architecture with 128 PEs with an original concept of data
distribution on the processors (helicoidal addressing). In this configuration, for a 512x512 pixel image the HT in 1 angular
direction (angle θ fixed) is computed in 3 ms and for 256 directions this computing time becomes 0.5 sec. In comparison,
the proposed optical implementation can achieve a higher processing speed. The input plane spatial light modulator (SLM)
can be replaced by a ferroelectric SLM operating at 10 kHz and the output camera by an array of photodetectors. Optical
implementation of the HT algorithm is attractive because it resolves computation speed problems through its inherent
advantages in implementing interconnections in parallel and at very high speed. This paper describes an optical
implementation of the HT using a matrix of computer generated holograms (CGH). The next Section briefly introduces the
HT generally used for detecting straight lines in input patterns, followed by a review of optical implementations of the HT.
The design, fabrication and testing of the HT CGH are presented in Section 3. The optical HT processor employing the HT
CGH as well as the experimental results are provided in Section 4, followed by concluding remarks in Section 5.


2.  PRINCIPLE  OF  HOUGH  TRANSFORM 


2.1.  Detection  of  a  straight  line 

For detecting lines in the input images, a mapping between a Cartesian input plane (x,y) and a Cartesian normal
parametrization output plane (θ,ρ) is used. The main advantage of this mapping lies in the regularity and uniformity of the
parameter space in comparison with the slope-intercept representation.

Figure 1 illustrates the principle of normal parametrization of a straight line f(x,y) defined in the input image plane by

$$f(x,y) = \begin{cases} 1, & (x,y) \in \{\rho_{0} = x\cos\theta_{0} + y\sin\theta_{0}\} \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where ρ0 is the shortest distance from the line to the origin and θ0 is the angle of ρ0 with respect to the x-axis.

The HT image F(θ,ρ) corresponding to input image f(x,y) is defined by the equation

$$F(\theta,\rho) = \iint f(x,y)\,\delta(\rho - x\cos\theta - y\sin\theta)\,dx\,dy \qquad (2)$$

where δ(ρ - x cos θ - y sin θ) is a delta function describing the HT point spread function (PSF).
Figure 1: Principle of the straight-line HT in normal parametrization: a) input plane, b) parameter domain plane.

The principles of the HT are summarized in Figure 1 where, for each point A_i with coordinates (x_i,y_i) in the image space,
there is a corresponding sinusoidal curve B_i in the (θ,ρ) parameter space. Collinear points A1, A2, A3 give rise to curves that
intersect in the output plane at one point (θ0,ρ0), which characterizes the parameters of the line defined by these collinear
points. This intersection is counted in an array of accumulator cells that performs quantization and storage of the parameter
space. Thus, the parameters of a line in the input plane are detected using a thresholding operation in the parameter domain.
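
The voting principle can be illustrated with a small Python sketch, in which each input point
casts one vote per quantized angle and the strongest accumulator cell gives the line
parameters. The function names, the 64 angular samples and the unit ρ step are assumptions
chosen for the example, not values taken from the processor described here.

```python
import math


def hough_votes(points, n_theta=64, rho_step=1.0):
    """Accumulate rho = x cos(theta) + y sin(theta) votes for each input point."""
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            cell = (t, round(rho / rho_step))  # quantized accumulator cell
            acc[cell] = acc.get(cell, 0) + 1
    return acc


def strongest_line(points, n_theta=64, rho_step=1.0):
    """Return (theta in degrees, rho, votes) of the most-voted accumulator cell."""
    acc = hough_votes(points, n_theta, rho_step)
    (t, r), votes = max(acc.items(), key=lambda kv: kv[1])
    return math.degrees(math.pi * t / n_theta), r * rho_step, votes


if __name__ == "__main__":
    # Three collinear points on the vertical line x = 5.
    print(strongest_line([(5, -20), (5, 0), (5, 20)]))
```

For the three points on the line x = 5, the sinusoids meet in the cell at θ = 0°, ρ = 5, which
collects all three votes; a threshold on the vote count then isolates the line.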
2.2. Detection of higher order parametric curves


Parametric curves described by more than two parameters will usually require a parameter domain of more than two
dimensions. For detecting a general ellipse described by five parameters (i.e., the coordinates of the center (x0,y0), the
lengths of the two axes a, b, and the orientation φ), the parameter domain requires a 5-D space. In the same way, a circle
needs a 3-D parameter space (i.e., the coordinates of the center (x0,y0) and the radius r). However, it has been shown13 that
a 2-D mathematical description of higher order parametric curves such as an ellipse or a circle is possible in a Cartesian
normal parametrization output plane (θ,ρ). In that case, the superposed sinusoidal curves no longer intersect at one point.
Rather, they form a pattern with an envelope that needs to be analyzed at different positions in the output plane to extract
the parameters of the input template curve. We choose to restrict our present study to the detection of circles and of
ellipses oriented at 0° or 90°; the general case of detecting a φ-oriented ellipse is explained in Ref. 13. The four parameters
of such an ellipse, i.e. the coordinates of the center (x0,y0) and the two axes a and b, are determined from the four measured
coordinates of the two envelopes in the HT plane at two θ-locations: θ=0° and θ=90° (see Figure 2). For the circle case, the
two ellipse axis parameters are equal, r=a=b.
Figure 2: HT for detecting a 90°-oriented ellipse: a) input ellipse, b) simulation of the HT parameter plane. Coordinates of the
center: x0 = (A + B)/2, y0 = (C + D)/2; length of the two axes: a = (A - B)/2, b = (C - D)/2.
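
The envelope relations for an axis-aligned ellipse can be checked numerically. The sketch
below assumes that A and B denote the upper and lower envelope values of ρ at θ=0°, and
C and D those at θ=90°; this reading of the symbols, the sampling of the ellipse, and all
function names are assumptions made for illustration.

```python
import math


def ellipse_points(x0, y0, a, b, n=360):
    """Sample n points on an axis-aligned ellipse with semi-axes a (along x) and b (along y)."""
    return [(x0 + a * math.cos(2 * math.pi * k / n),
             y0 + b * math.sin(2 * math.pi * k / n)) for k in range(n)]


def envelope(points, theta):
    """Upper and lower envelope of rho = x cos(theta) + y sin(theta) over all points."""
    rhos = [x * math.cos(theta) + y * math.sin(theta) for x, y in points]
    return max(rhos), min(rhos)


def ellipse_from_envelopes(points):
    """Recover (x0, y0, a, b) from the envelopes at theta = 0 and theta = 90 degrees."""
    A, B = envelope(points, 0.0)          # at theta = 0, rho equals x
    C, D = envelope(points, math.pi / 2)  # at theta = 90 degrees, rho equals y
    return (A + B) / 2, (C + D) / 2, (A - B) / 2, (C - D) / 2


if __name__ == "__main__":
    print(ellipse_from_envelopes(ellipse_points(8, -9, 10.5, 10.5)))
```

Only two columns of the parameter plane are read, which is why the restriction to 0° or 90°
orientations keeps the analysis two-dimensional.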

2.3. Optical implementation of the Hough transform

Various methods for the optical implementation of the HT have already been described in the literature. One type of
implementation uses mechanically rotating parts, which limit the processing speed14,15. Roux has introduced a HT
processor based on a coordinate transform CGH16. Casasent et al. have proposed several optical systems for the
implementation of the HT17,18,19. One of these implementations uses a CGH consisting of pairs of cylindrical lenses
oriented at different angles20,21. In this case, each pair of lenses computes the HT in one specific angular position. The
parameter domain is in a polar coordinate format, which may cause difficulty for practical realizations using conventional
optoelectronic detector arrays. Moreover, this output plane format makes any cascadability unrealistic. However, this
implementation needs a smaller space-bandwidth product (SBWP) for the CGH, although it introduces the disadvantage of
variable resolution in θ and ρ in the parameter domain.

In this manuscript, we focus on the principle and the realization of the HT using a two-dimensional array (i.e., matrix)
composed of CGHs. In this case, for each input pixel with coordinates (x_i,y_i), we use a corresponding CGH which
reconstructs in the Hough transform plane the corresponding sinusoidal curve,

$$\rho = x_{i}\cos\theta + y_{i}\sin\theta \qquad (3)$$

Thus  the  HT  of  an  input  image  is  obtained  from  intensity  summation  of  the  PSFs  reconstructed  from  the  hologram 
corresponding to each input image pixel (see Figure 3). Since the HT is a space-variant transformation, each hologram in the
array  is  different  from  its  neighbors. 
Figure 3: Principle of the optical HT processor using a CGH matrix.

Our approach is unique because it introduces an advantageous solution to the computation-intensive procedures involved in
electronically implementing space-variant transformations in general, and the HT in particular. Such an optical processor
represents a compact and motionless implementation, involving a simple and packageable optical system. This
implementation is the most closely related to the numerical implementation on electronic computers. With our approach, we
anticipate obtaining results similar to electronic realizations in terms of accuracy, but at a much higher speed.

The main advantage of our processor lies in its flexibility and versatility. Such a matrix can implement detection of numerous
primitive patterns (e.g., circles, ellipses, and straight lines) by properly analyzing the parameter domain. Furthermore,
detection of an object in specific angular positions can be achieved by a local observation in the parameter domain.
Cascadability of the processor is also possible because the output parameter domain can be arbitrarily tailored at the design
stage to meet the requirements of the next stage in the cascade. Ambs et al.13,22,23 have already presented an optical HT
implementation using a two-dimensional array of holograms. However, the quality of the optically recorded holograms (i.e.,
uniformity of the diffraction efficiency for each hologram) was not easy to control. In order to solve this problem, we propose
and report in this paper the design, fabrication and testing of a matrix of 64x64 CGHs using a novel encoding method. In the
following section, we describe the realization of the HT CGH array.

3. DESIGN AND FABRICATION OF THE HT CGH MATRIX


3.1.  Principle  of  HT  CGH  design 

Diffractive optics has made significant progress in recent years due to advances in microfabrication technologies
and developments in rigorous numerical modeling as well as CGH iterative design algorithms such as the iterative Fourier
transform algorithm (IFTA)24,25 and direct binary search (DBS)26. These advances make high performance diffractive optical
elements possible.

The  HT  implementation  using  an  array  of  CGH  elements  imposes  some  constraints.  For  example,  each  hologram  must  have 
the  same  performance  characteristics  (diffraction  efficiency,  uniformity  of  the  reconstructions  in  intensity,  and  high  contrast); 
for  most  applications,  the  number  of  holograms  must  be  equal  to  the  input  plane  resolution,  with  the  size  of  each  hologram 
determined  by  the  chosen  resolution  in  the  parameter  plane.  For  this  research,  a  64  x  64  CGH  matrix  was  realized, 
reconstructing  a  65x65  parameter  space,  enabling  a  larger  domain  of  application  in  comparison  with  other  studies27.  In  order 
to  obtain  a  maximal  diffraction  efficiency,  four-phase  level  Fourier  transform  holograms  have  been  employed.  The  absence  of 
the  complex  conjugate  order  in  the  reconstructions  allows  us  to  use  a  smaller  space-bandwidth  product  of  the  holograms 
(128x128  pixels  for  a  65x65  pixel  reconstruction  size).  These  holograms  were  designed  by  a  computer  using  a  novel  method 
based  on  combining  the  advantages  of  the  iterative  algorithms  IFTA  and  DBS. 

IFTA is an iterative technique based on the Gerchberg & Saxton algorithm28. The principle of this method24,25 consists in
performing successive iterations between the object image plane and the spatial spectrum plane, each followed by a constraint
on the CGH. To make the spectrum quantization process easier, the spectrum needs to be uniform, which is achieved in our
algorithm by employing an equalizer. The equalizer uses the degrees of freedom in space and phase29.
The DBS algorithm presented by Seldowitz et al.26 is based on the optimization of an error criterion using a Monte Carlo
technique. At the initial stage we generate a random hologram and compute the error between the desired image and the
reconstruction of this hologram. The pixels of the random CGH are then scanned and changed in a certain order and a new
error is computed. A pixel's modified value is retained only if the error decreases, so that the error criterion is
minimized. In comparison with IFTA, the DBS algorithm gives better results in terms of reconstruction error. However,
it is much more time consuming than IFTA (by a factor of approximately 20). For the HT hologram calculations, we therefore
combine the speed of convergence of IFTA with the accuracy of the DBS technique. The algorithm for HT CGH design can
be summarized as follows: first, an iterative spectral equalizer is applied to the desired reconstruction of the CGH. Then the
hologram is calculated using IFTA followed by a few DBS iterations, which increase the signal-to-noise ratio (SNR).
The total time required for the computation of a 64x64 array of holograms is about one month on an IBM RS 6000 340
workstation, with a mean time of about 700 sec per hologram. Computer simulation results provide a mean diffraction
efficiency of about 50%, with a standard deviation of 1.8%, and an SNR in intensity of 250, with a standard deviation of 87.
Figure 4 shows one hologram of the matrix and the simulation of its reconstruction.
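
The pixel-by-pixel search step can be sketched on a toy problem. The Python example below
is not the authors' design code: it starts from a random hologram rather than an IFTA
result, works on a 1-D phase-only hologram with four phase levels, and uses a plain
sum-of-squares intensity error. The names `reconstruct`, `error` and `direct_search` are
hypothetical.

```python
import cmath
import math
import random

LEVELS = 4  # four phase levels: 0, pi/2, pi, 3*pi/2


def reconstruct(levels):
    """Normalised intensity of the discrete Fourier transform of a phase-only hologram."""
    m = len(levels)
    field = [cmath.exp(2j * math.pi * q / LEVELS) for q in levels]
    out = []
    for k in range(m):
        s = sum(field[n] * cmath.exp(-2j * math.pi * k * n / m) for n in range(m))
        out.append(abs(s) ** 2 / m ** 2)  # intensities sum to 1 for a phase-only field
    return out


def error(levels, target):
    """Sum of squared differences between reconstructed and desired intensity."""
    return sum((r - t) ** 2 for r, t in zip(reconstruct(levels), target))


def direct_search(target, sweeps=3, seed=0):
    """Scan the pixels; keep a changed phase level only if the error decreases."""
    rng = random.Random(seed)
    holo = [rng.randrange(LEVELS) for _ in target]
    best = error(holo, target)
    history = [best]
    for _ in range(sweeps):
        for i in range(len(holo)):
            keep = holo[i]  # best level found so far for this pixel
            for q in range(LEVELS):
                if q == keep:
                    continue
                holo[i] = q
                e = error(holo, target)
                if e < best:  # retain the change only if the error drops
                    best, keep = e, q
            holo[i] = keep
        history.append(best)
    return holo, history


if __name__ == "__main__":
    target = [0.0] * 16
    target[3] = target[12] = 0.5  # two desired spots, total intensity 1
    holo, history = direct_search(target)
    print(history)
```

Every trial change requires a full reconstruction, which illustrates why a direct search is far
slower than an IFTA pass and why it is attractive to use it only for a few final iterations.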


Figure 4: a) A specific PSF of the HT matrix: desired reconstruction of the CGH corresponding to the illumination of one specific
point with coordinates (18,15) with respect to the origin located in the center of the input plane. b) 4-level-phase CGH. c)
Simulation of the CGH reconstruction.

3.2.  Fabrication  of  the  HT  matrix  of  CGHs 

The 4-phase-level Hough transform matrix hologram was fabricated on a quartz substrate (n = 1.457 @ 0.633 µm) using
standard microelectronics fabrication technologies: electron beam lithography defined the two mask patterns that are necessary
to make the 4-phase-level elements, and photolithography transferred the mask patterns into photoresist that served as etching
masks. The surface relief in the quartz substrate was etched using chemically assisted ion beam etching with CHF3 in the
etching chamber. The two etch depths were measured to have less than 2% error. The feature size of each pixel is 5 µm and
the misalignment between the two patterns was about 0.2 - 0.4 µm. Figure 5 shows micrographs of the fabricated CGH
taken by a scanning electron microscope. The main fabrication error was due to overexposure/overdevelopment of the second
pattern (π phase delay). This error caused a linewidth reduction in the second pattern (see Figure 5) and increased the
unwanted zero-order efficiency.
Figure 5: SEM micrograph of the fabricated Hough Transform hologram.

4.  EXPERIMENTAL  RESULTS 

The  experimental  set-up  for  testing  the  matrix  of  HT  holograms  is  shown  schematically  in  Figure  6.  A  twisted  nematic 
liquid  crystal  spatial  light  modulator  (SLM)  controlled  by  a  VGA  signal  is  used  to  introduce  the  input  image  into  the  system. 
The input image displayed with this SLM is more accurate in comparison with an RGB-video, because all the pixels are
individually  addressed.  A  system  of  imaging  lenses  is  also  used  to  magnify  and  display  the  input  plane  onto  the  HT  matrix  of 
CGHs.  A  Fourier  Transform  lens  reconstructs  the  holograms  and  an  acquisition  system,  consisting  of  a  CCD  camera  and  a 
PC  with  a  digital  frame  grabber,  is  used  to  visualize  resulting  reconstructions  in  the  output  parameter  plane. 


Figure  6:  Experimental  set-up  of  the  optical  HT  processor. 


The resulting output parameter domain is in Cartesian coordinates format (θ,ρ), where θ and ρ vary in the ranges [0, π] and
[-√2 D, √2 D], respectively, where D is the half size of the input image. For an input plane consisting of 64 x 64 pixels, the
output plane is a 65 x 65 accumulator cell array where the range of variation for ρ is [-45.25, 45.25], and the ρ and θ
resolution values are √2 and π/64 (2.81°), respectively.
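
These figures follow directly from the geometry, as the short Python check below shows. It
assumes that the 65 accumulator cells in ρ sample the range at both end points (64 intervals)
and that θ is sampled in steps of π/64; the function name and argument names are
hypothetical.

```python
import math


def parameter_plane(n_input: int = 64, n_cells: int = 65):
    """Return (rho_max, rho_step, theta_step in degrees) for an n_input x n_input image."""
    d = n_input / 2                         # half size D of the input image
    rho_max = math.sqrt(2) * d              # rho spans [-sqrt(2) D, +sqrt(2) D]
    rho_step = 2 * rho_max / (n_cells - 1)  # n_cells samples cover n_cells - 1 intervals
    theta_step_deg = 180.0 / n_input        # pi/64 for a 64 x 64 input
    return rho_max, rho_step, theta_step_deg


if __name__ == "__main__":
    rho_max, rho_step, theta_step = parameter_plane()
    print(f"rho range: +/-{rho_max:.2f}, rho step: {rho_step:.4f}, theta step: {theta_step:.2f} deg")
```

Under these assumptions the ρ step equals √2, i.e. the diagonal of one input pixel, so that
adjacent input pixels along the diagonal fall into adjacent accumulator cells.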
Figure 7 shows experimental results on generation of the HT for various examples of input patterns. Each output picture
displays an undesirable zero order resulting from imperfections in the hologram reconstruction. In fact, the presence of this
peak is due to etching errors during the fabrication process. It should also be noted that the CGHs were designed such that the
central order is not located on the central accumulator cell position in the reconstruction. Figures 7a and 7b show the
reconstruction of a single hologram. Figure 7a shows the reconstruction of the CGH corresponding to the central position in
the input plane (x0=0, y0=0). In this case, the sinusoidal curve described by the equation ρ = x0 cos θ + y0 sin θ becomes a
horizontal line defined by the relation ρ=0. Figure 7c displays the reconstruction of the global parameter plane when the entire
matrix of CGHs is illuminated. The intensity distribution in the output plane is the sum of all the reconstructed holograms.
The intensity is higher near the central axis of the matrix because that is where the largest number of curves are
superimposed. It should be pointed out that, as expected, the parameter domain is regularly covered by illuminated square cells.
Figures 7d and 7e show a centered straight line with a 135° slope and its HT response, respectively. All the sinusoids intersect
at the same point with coordinates θ=45° and ρ=0, correctly determining the parameters of the normal vector to the single
input line. Figure 7g shows the parameter domain for a series of parallel straight lines of different lengths shown in Figure
7f. We notice in this case that the different points are aligned on a vertical line corresponding to θ=45° and that the light
intensity varies proportionally to the length of the straight lines in the input plane, indicating that with a variable
threshold we can detect lines of variable length and also estimate the length of each line from the values of these thresholds.
Figures 7h to 7k provide an example of generating the HT for higher order parametric curves. The summation of the sinusoidal
curves forms a specific pattern in the output plane: a circle generates a slack strip (Figure 7i), whereas a centered ellipse
generates a flattened strip (Figure 7k). The evaluation of the ρ mean positions and the width of the strip on the θ=0° and
θ=90° columns gives information about the location of the center in the input plane as well as the diameter or dimensions of
the ellipse axes. We notice that our optical HT processor detects circles as well as ellipses with good accuracy (see Figures 7i
and 7k).


Figure 7: Output planes for different input images: a) PSF of the hologram of coordinate (0,0) in the matrix; b) PSF of one single
hologram in the array; c) global illumination of holograms matrix.
Figure 7 (cont'd): Output planes for different input images: e) photograph of the parameter domain for a straight line with parameters
θ=45° and ρ=0 in the input plane (d); f) series of parallel straight lines with a 135° slope in the input plane; g) HT of the series of
straight lines of (f); h) circle in the input plane: x0=8, y0=-9, 2r=21; i) output plane for the circle in (h); the width of the pattern for
θ=0° corresponds to the diameter of the circle. The measured parameters: x0=7.1, y0=-8.5, 2r=19.8; j) ellipse in the input plane:
x0=0, y0=0, a=20, b=40; k) output plane for the ellipse in (j). The dimensions of the axes (a,b) are read on the 0° and 90° columns,
whereas the location of the ellipse center (x0,y0) is determined by the ρ-mean position of the strip on the 0° and 90° columns. From
the measured parameters, we estimate: x0=0, y0=0, a=19.8, b=41.


5.  CONCLUSION 


An optical HT processor that uses a four-level-phase matrix of 64x64 CGHs is presented. Each element in the array is a
128x128 pixel CGH which is designed using a novel algorithm that combines the high speed of convergence of the iterative
Fourier transform algorithm (IFTA) with the high precision of the direct binary search (DBS) algorithm. Compared to off-axis
optically recorded thick holograms, the on-axis 4-level-phase CGH matrix introduces a great advantage into our system: the
absence of complex conjugate orders in the reconstruction allows for a smaller space-bandwidth requirement for the holograms
and simpler system packaging. The holograms were fabricated using micro-lithography and chemically assisted ion beam etching
techniques. The holograms generate a normal parameter HT space (ρ,θ) with a resolution of 65x65 pixels. The detection of line
segments within a 0° to 180° angle range is performed with a 2.81° resolution. The HT processor was applied to extract
parameters of various parametric curves such as single lines, parallel lines, circles and ellipses. The measured parameters are
found to be in excellent agreement with the actual inputs. This 41x41 mm diffractive optical element is one of the largest
matrices of space-variant CGHs created to date. The experimental results show that our on-axis processor is an attractive
alternative to existing optical and electronic architectures for implementing HT algorithms for real-time applications.
ABSTRACT

We present a study on a high-speed optoelectronic system for implementing space variant transforms (SVT) in image and
signal processing using a Hough Transform (HT) as an example. The HT has been found to be highly useful in applications
requiring detection of lines, ellipses and hyperbolic shapes, such as radar detection and data fusion, topographical map
analysis, etc. However, the implementation of a SVT such as the HT is computation and memory intensive, e.g. the HT of an
image of dimension N x N requires greater than N^3 operations. All-electronic systems remain inadequate when real time SVT
processing of large data sets is required. In this paper we show that an optoelectronic (OE) system employing parallel
processing can perform such SVT requiring on the order of only N steps. We show that our proposed OE system can perform
the HT on an input image of dimension N = 1024 in 2.1 ms.

Keywords:  Space-variant  transforms,  Hough  transforms,  optoelectronic  system,  computer-generated  hologram,  image  and 
signal  processing. 


1.  INTRODUCTION 

Space-variant transforms (SVT), such as the Hough transform1 (HT), are useful in many image processing applications
including radar detection2 and data fusion3, topographical map analysis4 and autonomous robot control5. However, the
implementation of SVT is computation and memory intensive6. Therefore, for efficient processing, all-electronic
implementations have been developed using parallel multi-processor computer systems, such as pyramid7, mesh8, multi-ring9 and systolic
array10 architectures.

For SVT, an attractive alternative to such all-electronic systems is an optical system that takes advantage of the inherent
parallelism of optics. Several such systems have been proposed, including the use of cylindrical11 and micro-lenses12,
rotationally multiplexed13 and computer generated holograms14 (CGH). For digital processing, these systems utilize spatial
light modulator (SLM) and detector (e.g. CCD) arrays for image input and output (I/O), respectively. We have implemented
such a HT processing system using a matrix of CGH15. Each hologram in the array is designed to map a specific image pixel,
displayed  on  an  SLM,  to  the  entire  CCD  pixel  array.  The  HT  for  the  entire  image  is  pre-computed  and  pre-stored  in  the  form 
of  a  CGH  array,  which  is  later  accessed  in  parallel  as  an  array  of  space-variant  impulse  response  holograms.  Therefore,  the 
optical  interconnection  hologram  can  be  seen  as  a  page  oriented  optical  memory.  The  speed  and  efficiency  of  the  SLM  and 
detector  technology  utilized  dictate  the  SVT  image  processing  time. 

The  realization  of  high-resolution  real-time  systems  may  require  optoelectronic  (OE)  processing  systems  that  utilize  fast 
CMOS  driver  technology.  In  such  a  system,  a  VCSEL  array  and  a  smart-pixel  focal  plane  array  can  be  used  for  I/O, 
respectively,  at  high  clock  rates.  In  this  paper  we  investigate  the  practical  implementation  of  such  a  system.  In  Section  2  we 
review  space  variant  transforms  and  line  detection  using  the  Hough  transform.  In  Section  3  we  discuss  a  high-speed 
optoelectronic  implementation  of  such  transforms  and  in  Section  4  we  define  the  performance  metrics  we  use  to  characterize 
the  system.  In  Section  5  we  compare  the  optoelectronic  system  to  all-electronic  parallel  systems.  In  the  last  section  we 
present  a  summary  and  conclusions. 
2. SPACE VARIANT TRANSFORMS


A SVT affects each input point differently, and can be defined by a 2-D superposition integral

$$F(x,y) = \iint_{-\infty}^{\infty} f(\xi,\eta)\, h(x,y;\xi,\eta)\, d\xi\, d\eta , \tag{1}$$

where $f(\xi,\eta)$ is the input image to the SVT filter, $F(x,y)$ is the output image and $h(x,y;\xi,\eta)$ is the SVT filter impulse
response. For SVT, the filter is dependent on the absolute position in the input plane and varies in the observation output
plane. Examples of SVT used in machine vision and pattern recognition include the log-polar16, reciprocal wedge17 and HT.


In  the  HT  each  point  from  the  image  (input)  domain  is  mapped  into  a  parametric  curve  in  the  output  (parameter)  domain. 
As  an  example  we  review  a  HT  employed  for  detection  of  a  straight  line  in  normal  parameterization.  A  straight  line  in  the 
input  domain  is  described  by 


$$f(x,y) = \begin{cases} 1, & \rho_0 = x\cos\theta_0 + y\sin\theta_0 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

where $\rho_0$ is the shortest distance to the origin and $\theta_0$ is the direction of $\rho_0$ (see Figure 1a). Therefore, the parameter domain
is the distance-angle plane $(\theta,\rho)$ with Cartesian coordinate axes $\rho$ and $\theta$ (see Figure 1b). In this case each input plane point
$(x_i, y_i)$ will be mapped into a sinusoidal curve in the $(\theta,\rho)$ domain

$$\rho = x_i\cos\theta + y_i\sin\theta . \tag{3}$$

Using Eq. (3) to design the SVT filter, Eq. (1) can be rewritten to give the output in the parameter
domain,

$$F(\rho,\theta) = \iint \delta(\rho - x\cos\theta - y\sin\theta)\, f(x,y)\, dx\, dy . \tag{4}$$


Figure 1. Normal parameterization of a straight line where the (a) input image plane has three points along
a line and the (b) parameter plane has three corresponding curves intersecting at $(\theta_0, \rho_0)$.

Each  point  in  the  input  plane  generates  a  sinusoidal  curve  in  the  output  domain  that  spans  all  of  the  lines  that  may  pass 
through that point. Sinusoidal curves that intersect in the parameter plane describe a line passing through at least two points
in the input plane (see Figure 1b). Therefore, multiple points lying along the same line described by $(\theta_0, \rho_0)$ in the input will
result in multiple curves intersecting in the parameter plane at the point $(\theta_0, \rho_0)$. By taking the sum of the intersecting points,
which  corresponds  to  a  measurement  of  intensity  in  an  optical  system,  the  strength  (or  number  of  points)  on  that  line  is 
determined.  Higher  order  parameterizations  have  also  been  developed  that  allow  for  the  2-D  detection  of  a  variety  of 
geometric  shapes  such  as  ellipses18  and  hyperbolic  curves19  as  well  as  the  iterative  detection  of  3-D  objects20. 
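
The voting procedure described above can be illustrated with a minimal digital sketch. It is not the optical implementation discussed in this paper: it simply evaluates the normal parameterization $\rho = x\cos\theta + y\sin\theta$ for each input point and counts votes in a discretized $(\theta,\rho)$ grid. The function names, the grid sizes and the range `rho_max` are illustrative assumptions.

```python
import math

def hough_accumulate(points, n_theta, n_rho, rho_max):
    """Vote in a (theta, rho) grid for a list of (x, y) input points.

    theta is sampled on [0, pi) and rho on [-rho_max, rho_max].
    """
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal parameterization of every line through (x, y)
            rho = x * math.cos(theta) + y * math.sin(theta)
            r = round((rho + rho_max) / (2 * rho_max) * (n_rho - 1))
            if 0 <= r < n_rho:
                acc[t][r] += 1
    return acc

def strongest_line(acc, rho_max):
    """Return (theta, rho, votes) of the accumulator cell with the most votes."""
    n_theta, n_rho = len(acc), len(acc[0])
    votes, t, r = max((acc[t][r], t, r)
                      for t in range(n_theta) for r in range(n_rho))
    theta = math.pi * t / n_theta
    rho = -rho_max + 2 * rho_max * r / (n_rho - 1)
    return theta, rho, votes
```

For three points on the vertical line x = 5, for example, `strongest_line(hough_accumulate([(5, 0), (5, 3), (5, 7)], 64, 65, 16.0), 16.0)` locates the cell where the three sinusoids intersect; the vote count plays the role of the summed intensity measured in the optical system.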
3. OPTOELECTRONIC SYSTEM IMPLEMENTATION


We  have  demonstrated  an  OE  processing  system  using  a  256x256  matrix  of  CGH  filters  for  Hough  transform,  coordinate 
transform,  and  optical  interconnects15.  To  construct  the  matrix  of  holograms,  the  CGH  encoding  of  the  SVT  for  each  pixel 
position  was  displayed  on  a  Hughes  liquid  crystal  light  valve  (LCLV)  and  optically  recorded,  in  the  Fourier  transform  plane, 
onto  Silver  Halide  holographic  film.  The  image  processing  system,  shown  in  Figure  2,  used  the  LCLV  to  generate  the  input 
image  plane,  which  is  imaged  onto  the  matrix  of  holograms.  The  output  is  reconstructed  onto  the  CCD  camera  using  a 
Fourier  lens. 


Figure  2.  Optoelectronic  system  for  space-variant  image  processing.  A  CRT  and  laser  are  used  to  generate 
the  real-time  image  from  the  LCLV.  The  outputs  from  the  holograms  are  reconstructed  off-axis  onto  the 
CCD  camera  using  a  Fourier  lens. 

We  have  also  demonstrated  a  HT  image  processing  system,  for  detection  of  straight  lines  and  ellipses,  using  a  64x64 
matrix  of  computer  generated  holograms  constructed  using  standard  micro-lithography  and  fabrication  techniques21.  To 
efficiently  encode  each  CGH  for  the  Fourier  transform  of  the  desired  filter,  we  used  a  modified  version  of  the  direct  binary 
search  (DBS)  method  combined  with  the  iterative  Fourier  transform  algorithm22.  The  four-phase  level  CGH  was  fabricated 
using  a  combination  of  e-beam  and  photo-lithography  and  chemically  assisted  ion  beam  etching.  Each  hologram  contains 
128x128 pixels, which are each 5 μm x 5 μm. The micro-fabricated CGH have the advantage of uniform holographic recording
of  each  pixel  as  well  as  flexibility  of  CGH  design,  e.g.  allowing  for  on-axis  reconstruction. 

The next generation of optoelectronic systems will require fast I/O to perform real-time processing of high-resolution images
on the order of 1024x1024 input pixels. Using VCSEL and smart focal plane arrays flip-chip bonded to CMOS drivers allows
operation at low power with I/O clock rates greater than 150 MHz23. The SVT of the input image to the output image is
performed  using  diffractive  and  micro-optics.  By  using  the  same  element  spacing,  the  VCSEL  and  hologram  arrays  can  be 
placed  directly  next  to  each  other  (see  Figure  3).  The  refractive  lens  may  also  be  replaced  with  an  array  of  micro-lenses. 
Therefore,  the  components  and  subsystems  can  be  packaged  into  a  compact  and  rigid  optoelectronic  system.  Efficient 
packaging  will  also  be  able  to  take  advantage  of  in-situ  recording  and  self-assembly  techniques. 


[Figure 3 labels: VCSEL, lens, CGH, smart-pixel array]

Figure  3.  Optoelectronic  SVT  image  processing  system  uses  VCSEL  and  smart-pixel  arrays  bonded  to 
CMOS  drivers  for  fast  I/O.  The  reconstruction  lens  can  be  replaced  by  diffractive  micro-lenses,  allowing 
for  a  more  integrated  and  compact  package. 
The VCSEL array is expected to act as a partially spatially coherent quasi-monochromatic source. A unique feature of
using  these  VCSEL  arrays,  as  opposed  to  traditionally  employed  SLM  implementations,  is  improvement  of  the  SNR 
performance  due  to  incoherent  superposition  at  the  output  (e.g.  reduced  speckle  noise).  Tunability  of  VCSEL  arrays  would 
also  enable  using  wavelength  multiplexing  of  volume  holography  to  pre-store  multiple  transforms,  thereby  allowing 
reconfiguration  for  efficiently  implementing  a  number  of  SVT  algorithms.  Since  such  systems  are  generally  used  for  line  and 
shape  detection  (where  missing  pixels  have  a  small  effect  on  detection  efficiency),  relatively  low  yield  VCSEL  arrays  can  be 
tolerated  allowing  for  the  use  of  large  scale  arrays. 

4.  PERFORMANCE  METRICS 

In  order  to  analyze  the  capabilities  of  such  an  OE  processing  system  we  define  a  set  of  performance  metrics  that  can  be 
compared  to  other  existing  SVT  image  processing  systems.  We  begin  by  defining  the  dimensions  of  our  system.  Let  the 
image (input) domain be of size NxN, the transform domain be of size KxK and the parameter (output) domain be of size MxM
(see Figure 4), where for simplicity we have assumed symmetric arrays. If the VCSEL (SLM) array is of size FxF (i.e. the
input image can have a different dimension than the VCSEL array) and the smart-pixel (detector) array is DxD, then for a
fully space-variant transform F = K and F < D.



Figure  4.  Dimensions  of  the  image,  transform  and  parameter  domains. 

The dimensions F and K, the input image and output planes, will be constrained by the attainable pixel size of the CGH
elements. Assuming the output from each VCSEL fills the area of its corresponding CGH, the 2-D space-bandwidth
product (SBWP) of a single hologram is $\mathrm{SBWP}_{single} = D^2 C^2$, where C is the spatial encoding (e.g. for a 64x64
image, a single hologram with 128x128 pixels results in a spatial encoding C = 2). Therefore, for the entire matrix of
holograms we obtain

$$\mathrm{SBWP} = K^2 D^2 C^2 . \tag{5}$$

We can also define the SBWP in terms of the physical dimensions of the matrix of holograms so that

$$\mathrm{SBWP} = \left(\frac{l_h}{\delta}\right)^2 , \tag{6}$$

where $l_h$ is the length of the hologram matrix and $\delta$ is the size of each pixel in the CGH (minimum feature size). If we
assume a SVT where K = D, then equating Eqs. (5) and (6) gives

$$K = D = \sqrt{\frac{l_h}{C\,\delta}} . \tag{7}$$

For example, if we assume that the matrix of holograms has a length $l_h = 100$ mm, a minimum feature size $\delta = 1$ μm and
encoding C = 6, then K, D ≈ 128 (i.e. the input/output can be of maximum dimension 128x128).
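
As a quick numerical check of this example, the sketch below equates the two SBWP expressions, $K^2 D^2 C^2 = (l_h/\delta)^2$, with K = D and solves for the array dimension. The function name is illustrative, and the closed form $K = \sqrt{l_h/(C\delta)}$ is our reading of the relation implied by the text.

```python
import math

def max_array_dimension(l_h, delta, c):
    """Largest K = D satisfying K^2 D^2 C^2 = (l_h / delta)^2 when K = D.

    l_h   : side length of the hologram matrix (m)
    delta : minimum feature size of the CGH (m)
    c     : spatial encoding factor
    """
    return math.sqrt(l_h / (c * delta))

# Values quoted in the text: l_h = 100 mm, delta = 1 um, C = 6
k = max_array_dimension(l_h=100e-3, delta=1e-6, c=6)
print(f"K = D = {k:.1f}")
```

The result lies just above 128, which is presumably why the text rounds down to the nearest power of two.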
The length of the system will depend on the size of each component as well as the distance between components. Assume
each VCSEL has an aperture a and a pitch p, then the distance between the VCSEL array and holograms can be
approximated by

$$z_{CGH} = \frac{p\,a}{2\lambda} \tag{8}$$

(assuming the VCSEL array pitch is equal to the pitch of the hologram array, then $p = l_h / K$). The distance from the
reconstructing lenses to the output plane is given by

$$z_o = f = d\,F/\# , \tag{9}$$

where f is the focal length (if we assume a refractive lens, then the diameter $d > \sqrt{2}\, l_h$). Since each pixel is Fourier
transformed onto the output plane, we can describe the size of the detector array by the relation $l_d = \lambda f / \delta$. The size of each
detector in the smart-pixel array is $\delta_d = l_d / D$ and the active detection area is restricted to $< \delta_d^2$ (we assume a densely packed
detector array with processing performed outside the detection area).


The product of its overall length and width gives the footprint of the system

$$S_F = z \cdot l_{max} = \left(t_V + z_{CGH} + t_h + t_R + z_o + t_d\right)\left(l_{max}\right) , \tag{10}$$

where $t_V$, $t_h$, $t_d$ and $t_R$ are the thickness of the VCSEL, hologram and smart-pixel detector arrays and the refractive lens,
respectively. Substituting from the relations above we get

$$S_F = \left(t_V + \frac{a\sqrt{l_h C \delta}}{2\lambda} + t_h + t_R + \sqrt{2}\, l_h\, F/\# + t_d\right)\left(l_h\right) , \tag{11}$$

where we assume the maximum width to be of order $l_h$. If we continue to assume a symmetric system, then the volume of the
system is defined by

$$S_V = \left(t_V + \frac{a\sqrt{l_h C \delta}}{2\lambda} + t_h + t_R + \sqrt{2}\, l_h\, F/\# + t_d\right)\left(l_h\right)^2 . \tag{12}$$


Let us assume the VCSEL and smart-pixel arrays have a thickness of 0.5 cm (including CMOS layer), the holographic array has
a thickness of 0.1 cm and the refractive lens has a thickness of 1.0 cm. Also, we assume F/1, $\lambda = 0.980$ μm, $\delta = 1$ μm,
K, D = 128 and C = 6. The hologram width will be $l_h = 10$ cm, which gives a focal length $f \approx 14$ cm. If the VCSEL
diameter is a = 3 μm, then the distance between the VCSEL and hologram array is $z_{CGH} \approx 1.1$ mm. The total length of the system
is $z \approx 16$ cm, which results in a footprint of $S_F \approx 160$ cm² and a volume of $S_V \approx 1600$ cm³. From Eq. (11) we see that the
system dimensions primarily depend on the size of each pixel, dictated by the minimum attainable feature size of the
diffractive optical element (DOE), along with the F/# of the reconstruction lens. For example, reducing the pixel size to
$\delta = 0.5$ μm will reduce the total length to $z \approx 9$ cm, the footprint to $S_F \approx 45$ cm² and the volume to $S_V \approx 225$ cm³. On the
other hand, using a F/2 lens will almost double the system dimensions. An attractive alternative to the thick refractive lens is
to  use  a  relatively  thin  diffractive  lens  (i.e.  CGH)  to  perform  the  Fourier  transform  from  the  HT  plane  to  the  output  plane. 
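
The size estimates above can be reproduced approximately with the following sketch. It rests on several assumptions that should be read as ours rather than as the definitive model: the hologram width is taken as $l_h = K^2 C \delta$, the VCSEL-to-hologram distance as $p\,a/(2\lambda)$ with $p = l_h/K$, and the lens diameter as $\sqrt{2}\, l_h$. The function and argument names are illustrative.

```python
import math

def system_dimensions(k, c, delta, wavelength, aperture, f_number,
                      t_vcsel, t_holo, t_lens, t_det):
    """Estimate length, footprint and volume of the OE system (SI units)."""
    l_h = k * k * c * delta                       # hologram matrix width
    pitch = l_h / k                               # VCSEL pitch = hologram pitch
    z_cgh = pitch * aperture / (2 * wavelength)   # assumed form of the VCSEL-hologram gap
    focal = math.sqrt(2) * l_h * f_number         # lens diameter taken as sqrt(2) * l_h
    length = t_vcsel + z_cgh + t_holo + t_lens + focal + t_det
    return {
        "l_h": l_h,
        "z_cgh": z_cgh,
        "focal": focal,
        "length": length,
        "footprint": length * l_h,     # width assumed to be of order l_h
        "volume": length * l_h ** 2,   # symmetric system
    }

dims = system_dimensions(k=128, c=6, delta=1e-6, wavelength=0.98e-6,
                         aperture=3e-6, f_number=1.0,
                         t_vcsel=0.5e-2, t_holo=0.1e-2,
                         t_lens=1.0e-2, t_det=0.5e-2)
print({name: round(value, 6) for name, value in dims.items()})
```

With these inputs the length, footprint and volume come out close to the 16 cm, 160 cm² and 1600 cm³ quoted above; the focal length term dominates, which is why the F/# and the feature size control the overall size.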

To analyze the power requirements of the system we begin with the smart-pixel detectors. We can approximate the power
requirements of each pixel by the relation

$$P_{det} = \frac{1}{R}\left(C_{det}\frac{dV}{dt} + I_{leak}\right) , \tag{13}$$

where R is the responsivity. If we assume a leakage current $I_{leak} = 100$ pA, detector capacitance $C_{det} = 30$ fF and
responsivity R = 0.3 A/W, then to get 10 mV at a 500 MHz clock rate should require an incident optical power of
$P_{det} \approx 0.5$ μW. Dines24 has shown that smart pixels operating at greater than 100 MHz clock rate require a detector area of
only 50 μm x 20 μm and less than 10 fJ of incident energy. At a 500 MHz clock rate this corresponds to $P_{det} \approx 5$ μW. Since
each VCSEL needs to power a minimum of 128 smart-pixels in the detector array, then assuming 50% efficiency of the DOE,
the required optical power from the VCSEL is between $P_V = 0.1$ and 1.0 mW. For low-threshold VCSELs we can assume a
slope efficiency of 55% and a threshold current $I_T = 212$ μA. Using the conservative upper-bound estimate $P_V = 1$ mW, then
the required current per VCSEL is I = 760 μA. Using 3 V CMOS driving circuitry would require 2.3 mW of driving power.
Since only one row is addressed at any one time, then for N = 128 where every pixel is in the on state this corresponds to
P < 0.3 W.
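
The power budget above can be retraced numerically. The sketch below assumes that the detector voltage swing develops over one clock period (so dV/dt is approximated by the swing times the clock rate) and that the responsivity is expressed in A/W; both are our assumptions, as are the function names.

```python
def detector_power(c_det, dv, clock_hz, i_leak, responsivity):
    """Optical power per detector: (C_det * dV * f + I_leak) / R, in watts."""
    return (c_det * dv * clock_hz + i_leak) / responsivity

def vcsel_optical_power(p_det, n_detectors, doe_efficiency):
    """Optical power one VCSEL must emit to feed n_detectors through the DOE."""
    return p_det * n_detectors / doe_efficiency

# Detector values quoted in the text
p_det = detector_power(c_det=30e-15, dv=10e-3, clock_hz=500e6,
                       i_leak=100e-12, responsivity=0.3)

# Lower and upper estimates of the VCSEL output (0.5 uW and 5 uW per detector)
p_v_low = vcsel_optical_power(p_det, n_detectors=128, doe_efficiency=0.5)
p_v_high = vcsel_optical_power(5e-6, n_detectors=128, doe_efficiency=0.5)

# Electrical drive power: 3 V supply, 760 uA per VCSEL, one row of 128 on
drive_power = 3.0 * 760e-6
row_power = 128 * drive_power

print(p_det, p_v_low, p_v_high, drive_power, row_power)
```

The two VCSEL estimates bracket the 0.1 to 1.0 mW range given above, and the row power stays just under 0.3 W.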

To determine the thermal performance of such a system we assume that the 0.3 W of power will need to be dissipated on
the end surfaces. For $l_h = 10$ cm, the surface area of each end is $S_E = 100$ cm². Therefore the VCSEL array will need to
dissipate approximately 3 mW/cm². We can assume that the smart-pixel array will need to dissipate the same order of power
over a similar surface area (on the opposite side of the OE system).

The performance metric that can most easily be compared to alternative systems is the amount of time required to SVT
process a single input image. For a CMOS driver operating at clock rate f we can assume that the VCSEL response time is
on the order of $\tau = 1/f$. Using parallel-matrix addressing, i.e. where only one pixel per row can be addressed at a time, the
load time of an input image is $t_K = K/f$. The smart-pixel detection time can be described by $t_d = (D + P)/f$, where D is the
number of rows in the detector and P is the number of clock cycles required to perform some local processing and A-D
conversion. The total processing time for N = K is then given by the sum

$$T = t_K + t_d = \frac{K + D + P}{f} . \tag{14}$$


For a system where the dimension of the input domain is greater than the filter domain, N > K (e.g. a large input image or
wavelength multiplexed processing), the system must perform some block-serial mapping and can be described by

$$T = \frac{K + D + P}{f}\left(\frac{N}{K}\right)^2 . \tag{15}$$

Similarly, for a parameter domain larger than the filter domain, M > D (e.g. for an asymmetric SVT), the total
processing time can be given by

$$T = \frac{K + D + P}{f}\left(\frac{NM}{KD}\right)^2 . \tag{16}$$


For large image sizes we can assume that the number of clock cycles required for local processing is small compared to
the image size, i.e. P ≪ K, D. There are three cases that we will examine:

1. The input and output dimensions are equal to (or smaller than) the filter domain, i.e. N = K = D = M, then the total
processing time reduces to

$$T_1 \approx \frac{2K}{f} . \tag{17}$$

2. The input dimension is larger than the filter dimension, i.e. N > K, then

$$T_2 \approx \frac{K + D}{f}\left(\frac{N}{K}\right)^2 . \tag{18}$$

3. The input and output dimensions are larger than the filter dimension, i.e. N > K and M > D. However, we will assume
that the input and output are of the same dimension, i.e. N = M and K = D, then

$$T_3 \approx \frac{2K}{f}\left(\frac{N}{K}\right)^4 . \tag{19}$$
In Figure 5 we have plotted the processing time for each of the three cases discussed and we see that there is a steep power-law
increase in the processing time when the input/output sizes are larger than the holographic filter. For example, assume an
input image of 512x512 and CMOS clock rate of 500 MHz. If the matrix of holograms is also 512x512 then the total
processing time will be $T_1 = 2.0$ μs. However, for a SVT filter of size 128x128, the processing time is increased to
$T_3 = 130$ μs (where the output is of dimension 512x512).
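
The three cases can be evaluated with one small function, sketched below. It assumes that each block of K x K inputs mapped onto D x D outputs costs (K + D + P)/f, and that the number of serial blocks scales as $(N/K)^2$ on the input side and $(M/D)^2$ on the output side; this scaling is inferred from the figures quoted in the text. The function name is illustrative.

```python
def processing_time(n, m, k, d, f, p=0):
    """Total SVT processing time in seconds.

    n, m : input and output image dimensions (N x N, M x M)
    k, d : hologram-matrix and detector-array dimensions (K x K, D x D)
    f    : clock rate in Hz
    p    : clock cycles for local processing and A-D conversion
    """
    input_blocks = max(1.0, (n / k) ** 2)    # block-serial loading of the input
    output_blocks = max(1.0, (m / d) ** 2)   # block-serial readout of the output
    return (k + d + p) / f * input_blocks * output_blocks

f = 500e6
print(processing_time(512, 512, 512, 512, f))     # filter as large as the image
print(processing_time(512, 512, 128, 128, f))     # 128x128 filter, 512x512 image
print(processing_time(1024, 1024, 128, 128, f))   # 128x128 filter, 1024x1024 image
```

The first two calls correspond to the 2.0 μs and 130 μs examples in the text; the third gives the 1024x1024 case.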

Figure 5. Log of computing time in units of K/f as a function of N/K.

Another convenient measure of performance is throughput, which is defined as the number of pixels processed per second
and for an NxN image can be given by

$$Th = \frac{N^2}{T} = N^2\,\frac{f}{K + D + P}\left(\frac{KD}{NM}\right)^2 . \tag{20}$$

Again we assume P ≪ K, D, where our three cases are:

1. N = K = D = M, then the throughput reduces to

$$Th_1 \approx \frac{fK}{2} . \tag{21}$$

2. N > K, then

$$Th_2 \approx \frac{fK^2}{K + D} . \tag{22}$$

3. N > K, M > D, N = M and K = D, then

$$Th_3 \approx \frac{fK^3}{2N^2} . \tag{23}$$

Using the same example as above, i.e. 512x512 image and 500 MHz clock rate, if the matrix of holograms is also 512x512
then the total throughput will be $Th_1 = 1.2 \times 10^{11}$ pixels/sec and for a filter of size 128x128, the throughput is
$Th_3 = 2.0 \times 10^{9}$ pixels/sec.


Finally, for a binary input image we can describe the number of binary operations per second or bandwidth by the relation

$$O = \frac{N^2 M^2}{T} = N^2 M^2\,\frac{f}{K + D + P}\left(\frac{KD}{NM}\right)^2 , \tag{24}$$

and if we assume that K = D and P ≪ K, D, then this simplifies to

$$O \approx \frac{f K^3}{2} . \tag{25}$$

For a filter of size 128x128 and 500 MHz clock rate the number of binary operations is $O \approx 5.2 \times 10^{14}$ operations/sec. This large
bandwidth is representative of the parallelism of an optics-based processing system.
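
A short sketch of the throughput and bandwidth figures follows, under the simplifying assumptions N = M, K = D and P ≪ K. The block-serial scaling $(N/K)^4$ is inferred from the processing times quoted earlier, and the function names are illustrative.

```python
def throughput(n, k, f):
    """Pixels processed per second for N = M, K = D and P << K."""
    t = 2 * k / f * max(1.0, n / k) ** 4   # processing time of one N x N image
    return n * n / t

def bandwidth(k, f):
    """Binary operations per second for K = D and P << K: f * K^3 / 2."""
    return f * k ** 3 / 2

f = 500e6
print(f"{throughput(512, 128, f):.2e} pixels/sec")   # 128x128 filter, 512x512 image
print(f"{bandwidth(128, f):.2e} operations/sec")     # 128x128 filter
```

Note that the bandwidth depends only on the filter dimension and the clock rate, not on the image size, since the block-serial factors cancel.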


5.  COMPARISON 


For  comparison  of  our  proposed  OE  SVT  image  processing  system  to  all-electronic  systems,  we  will  assume  a  relatively 
conservative  OE  system  configuration  that  has  a  128x128  VCSEL  array  for  input,  a  128x128  array  of  holograms  for  SVT 
filtering,  a  128x128  smart-pixel  detector  array  and  CMOS  circuitry  operating  at  a  500  MHz  clock  rate.  For  such  a  system  a 
512x512 image will be SVT processed in approximately 130 μs and a 1024x1024 input image will be processed in 2.1 ms. A
proposed  OE  system  using  ferroelectric  liquid  crystal11  (FLC)  input  devices  and  CGH  encoded  HT  filters  is  capable  of 
processing  a  512x512  image  in  1  ms,  limited  by  the  speed  of  the  SLM. 

The all-electronic SYMPATI2, developed in 1992 by CEA/LETI/DEIN, which uses the single instruction multiple data
(SIMD) architecture and employs 128 parallel processing elements (PE), was reported27 to perform the HT of a 512x512 input
image (for 512 angles) in approximately 1 second. More recently, a HT board, utilizing a reconfigurable parallel architecture
and  two  content  addressable  memory  (CAM)  chips,  reported28  line  extraction  of  a  256x256  input  image  in  76  ms.  Using  a 
reconfigurable  multi-ring  network  (RMRN)  architecture  where  each  node  is  a  T9000  transputer,  the  HT  operation  on  a 
1024x1024  input  image  is  estimated9  at  14  seconds,  using  128  nodes.  Performing  the  HT  for  straight-line  detection  in  a 
512x512 input image (with 2048 angles) using the Systola 1024, a systolic array of 1024 processors implemented on a
standard  PCI  board,  took10  0.25  seconds.  This  corresponds  to  a  processing  time  of  62.5  ms  for  512  angles. 

An alternative to these electronic DSP methods for HT processing is a strictly memory lookup method. To analyze the
performance for a HT processing system that uses a lookup table, i.e. storing the HT $(\theta,\rho)$ results for all (x,y) input points,
we begin by defining the dimensions of the storage required. Assuming a binary NxN input image, then the input is $N^2$ bits.
Each input point maps a sinusoidal curve to the output, i.e. N angles $\theta$. For each $\theta$ we need a word of length $\log_2 N$ to
distinguish between N possible $\rho$. Therefore, the total storage required is

$$S = N^3 \log_2 N \ \text{bits} . \tag{26}$$

Since each angle is independent, we divide the transforms by angle into multiple memory chips (R) (i.e. each chip determines
N/R of the angles $\theta$), giving the storage for each chip

$$S = \frac{N^3 \log_2 N}{R} \ \text{bits} . \tag{27}$$

For N = 1024, the storage required is $10^{10}/R$ bits.
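
The storage estimate is simple enough to check directly. The helper below (an illustrative name) evaluates $N^3 \log_2 N / R$ for the lookup table.

```python
import math

def lookup_storage_bits(n, r=1):
    """Bits of lookup-table storage per chip for an N x N image split over R chips."""
    return n ** 3 * math.log2(n) / r

print(f"{lookup_storage_bits(1024):.3e} bits in total")
print(f"{lookup_storage_bits(1024, r=16):.3e} bits per chip for R = 16")
```

For N = 1024 the total is slightly above $10^{10}$ bits, consistent with the rounded figure in the text.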


For  a  lookup  method,  the  HT  can  be  described  by  the  following  steps: 

1. Read input (x,y) value, assume 1 clock cycle.

2. Read HT $(\theta,\rho)$ value, 1 clock cycle.

3. Read existing output array value $(\theta_{old},\rho_{old})$, 1 clock cycle.

4. Increment output array value $(\theta_{old},\rho_{old})$, assume 2 clock cycles.

5. Write new output array value $(\theta_{new},\rho_{new})$, 1 clock cycle.
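
The five steps above can be sketched in software as a table-driven accumulation. This is only an illustration of the lookup idea, not of any particular memory architecture: the table layout, the quantization of $\rho$ into N bins and the function names are assumptions made for the example.

```python
import math

def build_lookup_table(n, n_theta):
    """Precompute, for every (x, y), the rho bin reached at each theta index."""
    rho_max = math.sqrt(2) * (n - 1)   # largest |rho| for points inside the image
    table = {}
    for x in range(n):
        for y in range(n):
            bins = []
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                rho = x * math.cos(theta) + y * math.sin(theta)
                # Quantize rho into n bins, i.e. a word of about log2(n) bits
                bins.append(round((rho + rho_max) / (2 * rho_max) * (n - 1)))
            table[(x, y)] = bins
    return table

def hough_by_lookup(image, table, n_theta):
    """Accumulate the HT of a binary image using only table reads and increments."""
    n = len(image)
    acc = [[0] * n for _ in range(n_theta)]
    for x in range(n):
        for y in range(n):
            if image[x][y]:                            # step 1: read input value
                for t, r in enumerate(table[(x, y)]):  # step 2: read stored (theta, rho)
                    acc[t][r] += 1                     # steps 3-5: read, increment, write
    return acc
```

The table holds one $\rho$ word per angle for each input point, which is the origin of the $N^3 \log_2 N$ storage figure when the number of angles equals N.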

Therefore, we can estimate the total time required for processing of an NxN input image to be

$$T \approx \frac{6 N^3 \log_2 N}{R f} , \tag{28}$$
where f is the clock rate of the memory chips and we assume that R < N. For comparisons to alternative systems we will
assume our memory to be Rambus RDRAM chips, capable of operating on a word (~$\log_2 N$ bits) in parallel at 1 Giga word per
second average clock rate. Therefore, for N = 512, and assuming R = 16, the total processing time is T ≈ 0.05 seconds and for
N = 1024 the processing time is T ≈ 0.4 seconds. To compare this lookup method with our proposed OE system we plot the
processing times, using Eqs. (17), (18) and (28), in Figure 6. Comparing ideal systems, where N = K, R, the OE system
increases its advantage in processing speed as the dimensions of the input image increase. However, for an OE system with a
small number of holograms, the steep power-law increase in processing time decreases its advantage over the strictly memory
lookup table system with a limited number of memory chips.
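
The comparison can be reproduced with a few lines of code. The sketch assumes six clock cycles per table entry (the sum of the five steps listed earlier) and a memory that handles one $\log_2 N$-bit word per cycle at 1 Gword/s, so that the lookup time becomes $6N^3/(R \cdot \text{word rate})$; for the OE system it assumes N = M, K = D and P ≪ K. The function names are illustrative.

```python
def lookup_time(n, r, word_rate, cycles_per_entry=6):
    """Lookup-table HT time in seconds, with one word handled per memory cycle."""
    return cycles_per_entry * n ** 3 / (r * word_rate)

def oe_time(n, k, f):
    """OE processing time in seconds for N = M, K = D and P << K."""
    return 2 * k / f * max(1.0, n / k) ** 4

for n in (128, 256, 512, 1024):
    t_lookup = lookup_time(n, r=16, word_rate=1e9)
    t_oe = oe_time(n, k=128, f=500e6)
    print(f"N = {n:4d}: lookup {t_lookup:.2e} s, OE {t_oe:.2e} s")
```

Because the lookup time grows as $N^3$ while the block-serial OE time grows as $N^4$ for fixed K, the ratio between the two shrinks as N increases, which is the loss of advantage noted above.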


Figure  6.  Log  plot  shows  processing  times  for  RDRAM  memory  resident  and  OE  processing  systems  as 
functions  of  input  image  dimension. 

In  Table  1  we  compare  the  processing  times  for  the  various  systems  discussed  above.  The  all-electronic  systems  can 
perform  the  HT  on  the  order  of  milliseconds  for  images  of  size  up  to  512x512  and  on  the  order  of  seconds  for  larger  input 
images.  On  the  other  hand,  our  proposed  OE  system  can  perform  the  HT  on  images  of  dimension  1024x1024  in  a  few 
milliseconds, as required for real-time processing.


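System: processing time (input image dimension N, where recoverable)

SunSPARC20: 15 s
SYMPATI2: 24 ms; 1.5 s
CAM: 76 ms (N = 256)
RMRN: 14 s (N = 1024)
Systola 1024: 62 ms (N = 512)
RDRAM, R = 16: 0.79 ms (N = 128); 6.3 ms (N = 256); 50 ms (N = 512); 0.40 s (N = 1024)
OE, K = 128: 0.51 μs (N = 128); 8.2 μs (N = 256); 0.13 ms (N = 512); 2.1 ms (N = 1024)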

Table  1.  Processing  times,  given  in  seconds,  for  various  SVT  image-processing  systems.  For  comparison 
we  computed  a  serial  implementation  of  the  HT  on  a  SunSPARC20  workstation  with  a  100  MHz  processor. 


6.  CONCLUSION 

An optoelectronic SVT image processing system, using CMOS-driven VCSEL and smart-pixel arrays for I/O and filtering
performed by a matrix of holograms, has been presented. In order to analyze the strengths and weaknesses of such a system
we defined a set of performance metrics. The dimensions of the OE system are primarily constrained by the minimum feature
sizes attainable using micro-fabrication techniques. However, using relatively conservative estimates our proposed system is
highly compact. Using currently available technology our proposed OE processing system can perform the HT on a 128x128
image in 0.51 μs, and using block-serial mapping the HT of a 1024x1024 input image would take only 2.1 ms.

The steep power-law growth of the processing times for an OE system using relatively small arrays is the greatest limitation to
SVT processing of very large and high-resolution images. However, since these systems can tolerate low-yield VCSEL
arrays, or even stitching of multiple small arrays, it may be feasible to have VCSEL or detector arrays on the order of
1024x1024. For an input image size of 1024x1024 into a SVT on an OE processing system with VCSEL, hologram and
smart-pixel arrays all of dimension 1024x1024 (stitched or low-yield), the processing time is estimated at only 4.1 μs.